arxiv: 2505.09067 · v2 · pith:UVOUE3LXnew · submitted 2025-05-14 · 🧮 math.OC · cs.RO · cs.SY · eess.SY

Solving Reach- and Stabilize-Avoid Problems Using Discounted Reachability

Pith reviewed 2026-05-22 16:20 UTC · model grok-4.3

classification 🧮 math.OC · cs.RO · cs.SY · eess.SY
keywords reach-avoid · stabilize-avoid · Hamilton-Jacobi reachability · value function · viscosity solution · nonlinear systems · zero-sum games · discounted reachability

The pith

A new Lipschitz continuous value function exactly identifies the states from which a controller can reach a target without violating constraints despite worst-case disturbances.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper develops a discounted formulation for infinite-horizon reach-avoid and stabilize-avoid zero-sum games on general nonlinear continuous-time systems. The central construction is a Lipschitz continuous reach-avoid value function whose zero sublevel set precisely describes the set of states that can be driven to a target while staying safe under adversarial disturbance. The authors prove the associated Bellman backup operator is contractive and that the value function is the unique viscosity solution of the corresponding Hamilton-Jacobi variational inequality. They then combine the reach-avoid strategy with a robust control Lyapunov-value function to create a two-step method that also guarantees long-term stability once the target is reached. A numerical example on a 3D Dubins car confirms the approach yields computable safe sets for practical systems.

Core claim

We address the reach-avoid problem by designing a new Lipschitz continuous reach-avoid value function whose zero sublevel set exactly characterizes the reach-avoid set. We establish that the associated Bellman backup operator is contractive and that the reach-avoid value function is the unique viscosity solution of a Hamilton-Jacobi variational inequality. For the stabilize-avoid problem we develop a two-step framework that integrates our reach-avoid strategies with a robust control Lyapunov-value function to ensure both target reachability and long-term stability.
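
For orientation, the contraction claim has a standard consequence that can be written out. The sketch below is generic Banach fixed-point reasoning, not the paper's proof: the operator symbol $B$, the modulus $\kappa \in (0,1)$, and the iterates $V_k$ are notation assumed here, and the paper's actual modulus (which would depend on its discount rate) is not stated on this page.

```latex
% Assumed notation: B is the Bellman backup, \kappa \in (0,1) its contraction modulus.
\| B[V] - B[W] \|_\infty \le \kappa \, \| V - W \|_\infty
\quad \text{for all bounded } V, W .

% Uniqueness: if V^* = B[V^*] and W^* = B[W^*], then
\| V^* - W^* \|_\infty
  = \| B[V^*] - B[W^*] \|_\infty
  \le \kappa \, \| V^* - W^* \|_\infty
\;\Longrightarrow\; V^* = W^* .

% Value iteration V_{k+1} = B[V_k] converges geometrically to a fixed point V^*:
\| V_k - V^* \|_\infty \le \kappa^{k} \, \| V_0 - V^* \|_\infty .
```

This argument yields uniqueness and convergence of the fixed point only. Whether the zero sublevel set of that fixed point equals the reach-avoid set is a separate property that has to come from how the value function is designed.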

What carries the argument

The Lipschitz continuous reach-avoid value function, whose zero sublevel set encodes the desired safe reachable states and which solves the Hamilton-Jacobi variational inequality uniquely as a viscosity solution.

Load-bearing premise

A Lipschitz continuous reach-avoid value function exists whose zero sublevel set exactly coincides with the true reach-avoid set under the discounted formulation for arbitrary nonlinear continuous-time systems.

What would settle it

A concrete nonlinear system together with target and unsafe sets for which the numerically computed zero sublevel set either contains a state from which no control avoids the unsafe region while reaching the target, or excludes a state where such a control exists.
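
A check of this kind can be sketched on a toy problem where the ground truth is computable by brute force. Everything below is a hypothetical illustration, not taken from the paper: a deterministic 1D grid with no disturbance, an assumed target and obstacle, and a deliberately wrong candidate value function. The helper names `true_reach_avoid_set` and `audit_zero_sublevel` are invented for this sketch.

```python
def true_reach_avoid_set(states, neighbors, in_target, is_safe):
    """Brute-force ground truth: safe states with a safe path into the target."""
    ra = {s for s in states if is_safe(s) and in_target(s)}
    changed = True
    while changed:
        changed = False
        for s in states:
            if s not in ra and is_safe(s) and any(n in ra for n in neighbors(s)):
                ra.add(s)
                changed = True
    return ra


def audit_zero_sublevel(states, value, truth):
    """Compare the set {s : value(s) <= 0} against the ground-truth set."""
    claimed = {s for s in states if value(s) <= 0.0}
    false_pos = sorted(claimed - truth)  # claimed reach-avoid, but actually not
    false_neg = sorted(truth - claimed)  # reach-avoid, but excluded
    return false_pos, false_neg


# Assumed toy setup: integer cell i stands for position x = i / 10.
STATES = list(range(-40, 41))


def neighbors(i):
    # The controller may step left, stay, or step right (clipped at the edges).
    return [max(i - 1, -40), i, min(i + 1, 40)]


def in_target(i):
    return abs(i - 20) <= 5   # assumed target: x in [1.5, 2.5]


def is_safe(i):
    return abs(i + 10) >= 5   # assumed obstacle: x in (-1.5, -0.5)


def candidate(i):
    # Deliberately wrong value function: non-positive for every x >= -2.0,
    # so it ignores the obstacle entirely.
    return (-20 - i) / 10.0


TRUTH = true_reach_avoid_set(STATES, neighbors, in_target, is_safe)
false_pos, false_neg = audit_zero_sublevel(STATES, candidate, TRUTH)
print("false positives:", false_pos)
print("false negatives:", false_neg)
```

Either list being non-empty for a value function computed by the paper's method, on a system where the ground truth is trustworthy, would be the kind of evidence described above. Grid resolution and solver tolerance would need to be ruled out as the cause first.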

Figures

Figures reproduced from arXiv: 2505.09067 by Boyang Li, Sylvia Herbert, Zheng Gong.

Figure 1: SA value function $V^{\mathrm{SA}}_{\gamma}(x)$ and its level sets. The regions enclosed by the green and black solid lines are the target $\mathcal{T}$ and obstacles, respectively, so the region outside the black solid lines is the constraint set $\mathcal{C}$. The regions enclosed by the magenta dashed lines indicate the zero superlevel set of $V^{\mathrm{SA}}_{\gamma}(x)$, so the regions outside compose the $\mathrm{SA}(\mathcal{T}, \mathcal{C})$ set.

Figure 2: SA (left) and RA (right) trajectories of the 3D Dubins car that …

Original abstract

In this article, we consider the infinite-horizon reach-avoid (RA) and stabilize-avoid (SA) zero-sum game problems for general nonlinear continuous-time systems, where the goal is to find the set of states that can be controlled to reach or stabilize to a target set, without violating constraints even under the worst-case disturbance. Based on the Hamilton-Jacobi reachability method, we address the RA problem by designing a new Lipschitz continuous RA value function, whose zero sublevel set exactly characterizes the RA set. We establish that the associated Bellman backup operator is contractive and that the RA value function is the unique viscosity solution of a Hamilton-Jacobi variational inequality. Finally, we develop a two-step framework for the SA problem by integrating our RA strategies with a recently proposed Robust Control Lyapunov-Value Function, thereby ensuring both target reachability and long-term stability. We numerically verify our RA and SA frameworks on a 3D Dubins car system to demonstrate the efficacy of the proposed approach.
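
For readers unfamiliar with the test system, a Dubins car has state (x, y, heading) and is steered through its turn rate. The sketch below is a minimal forward-Euler simulator under assumptions that are not confirmed by this page: constant speed `v = 1.0`, step size `dt = 0.01`, and a disturbance that enters additively on the position rates. The function names `dubins_step` and `rollout` are hypothetical.

```python
import math


def dubins_step(state, u, d=(0.0, 0.0), v=1.0, dt=0.01):
    """One forward-Euler step of a Dubins car.

    state: (x, y, theta); u: turn rate chosen by the controller;
    d: assumed additive disturbance on the x and y rates.
    """
    x, y, theta = state
    dx, dy = d
    return (
        x + dt * (v * math.cos(theta) + dx),
        y + dt * (v * math.sin(theta) + dy),
        theta + dt * u,
    )


def rollout(state, control, disturbance, steps, v=1.0, dt=0.01):
    """Simulate a closed-loop trajectory from state-feedback policies."""
    traj = [state]
    for _ in range(steps):
        state = dubins_step(state, control(state), disturbance(state), v, dt)
        traj.append(state)
    return traj


# Usage: drive straight along the x-axis for one time unit, no disturbance.
straight = rollout(
    (0.0, 0.0, 0.0),
    control=lambda s: 0.0,
    disturbance=lambda s: (0.0, 0.0),
    steps=100,
)
print(straight[-1])
```

In a reach-avoid study the `control` and `disturbance` callables would be replaced by the policies extracted from the value function, with the disturbance chosen adversarially.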

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper addresses infinite-horizon reach-avoid (RA) and stabilize-avoid (SA) zero-sum games for general nonlinear continuous-time systems with disturbances. It introduces a new Lipschitz continuous RA value function whose zero sublevel set is asserted to exactly characterize the RA set, shows that the associated Bellman backup operator is contractive, and proves that this value function is the unique viscosity solution of a Hamilton-Jacobi variational inequality. For the SA problem, the RA strategies are combined with a robust control Lyapunov-value function to ensure both reachability and asymptotic stability. The claims are supported by numerical experiments on a 3D Dubins car.

Significance. If the exact zero-sublevel characterization holds under the discounted formulation, the work supplies a contractive operator whose fixed point yields the precise RA set via viscosity solution theory, offering a route to both theoretical guarantees and potentially more stable numerical schemes for infinite-horizon problems. The two-step integration with the robust CLVF for SA problems is a natural and useful extension. The numerical verification on the Dubins car provides concrete evidence of practical utility.

major comments (2)
  1. [§3.2, Definition 3 and Theorem 3.1] The central claim that the zero sublevel set of the proposed Lipschitz RA value function exactly coincides with the reach-avoid set for arbitrary Lipschitz dynamics and bounded disturbances is not yet established rigorously enough to be load-bearing. The discounted infinite-horizon cost typically produces a strictly positive value outside the set that only approaches the indicator function as the discount factor tends to zero; the manuscript must supply an explicit argument showing why the chosen running-cost design recovers the exact boundary for any fixed positive discount factor.
  2. [§4, viscosity-solution uniqueness argument] While contractivity of the Bellman operator is asserted, the proof that this operator maps the space of Lipschitz functions into itself, and that the fixed point satisfies the HJ variational inequality with the exact level-set property, needs to be checked against possible boundary discrepancies introduced by discounting. A concrete counter-example or a limiting argument should be added if the exactness does not follow directly from contractivity alone.
minor comments (2)
  1. Notation for the target set and constraint set is introduced without a dedicated table or diagram; adding one would improve readability when comparing the RA and SA formulations.
  2. The numerical section would benefit from an explicit statement of the chosen discount factor and a brief sensitivity study showing that the computed level sets remain stable under moderate changes in this parameter.
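
A sensitivity study of the kind requested in minor comment 2 could be organized as a sweep over the discount parameter. The sketch below does not use the paper's value function or solver. It substitutes a discrete-time discounted reach-avoid backup in the style used in the reinforcement-learning literature, on an assumed 1D grid with no disturbance, where `gamma` is a per-step factor in (0, 1) and values near 1 mean weak discounting. The target, obstacle, and function names are hypothetical.

```python
def fixed_point(gamma, tol=1e-10, max_iter=100_000):
    """Iterate a discrete-time discounted reach-avoid backup to convergence."""
    cells = range(-40, 41)  # cell i stands for position x = i / 10
    # Assumed margins: l <= 0 inside the target, g > 0 inside the obstacle.
    l = {i: abs(i / 10 - 2.0) - 0.5 for i in cells}
    g = {i: 0.5 - abs(i / 10 + 1.0) for i in cells}
    V = {i: max(l[i], g[i]) for i in cells}
    for _ in range(max_iter):
        new = {}
        for i in cells:
            # Controller picks the best of: step left, stay, step right.
            nxt = min(V[max(i - 1, -40)], V[i], V[min(i + 1, 40)])
            new[i] = ((1 - gamma) * max(l[i], g[i])
                      + gamma * max(g[i], min(l[i], nxt)))
        gap = max(abs(new[i] - V[i]) for i in cells)
        V = new
        if gap < tol:
            break
    return V


def zero_sublevel(V):
    """Cells whose value is non-positive, in increasing order."""
    return sorted(i for i, v in V.items() if v <= 0.0)


# Sweep the discount factor and record how large the computed set is.
sizes = {}
for gamma in (0.90, 0.95, 0.99):
    sizes[gamma] = len(zero_sublevel(fixed_point(gamma)))
    print(f"gamma={gamma:.2f}  cells in zero sublevel set: {sizes[gamma]}")
```

For this substitute backup the computed set is not expected to be independent of `gamma`, which is the behavior a sensitivity study would be looking for. A set that stays fixed across the sweep under the paper's formulation would be consistent with its exactness claim.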

Simulated Author's Rebuttal

2 responses · 0 unresolved

Thank you for the opportunity to respond to the referee's report. We are grateful for the positive assessment of the significance of our work and for the constructive major comments. Below we address each comment in turn, indicating how we will revise the manuscript to strengthen the presentation and proofs.

Point-by-point responses
  1. Referee: [§3.2, Definition 3 and Theorem 3.1] The central claim that the zero sublevel set of the proposed Lipschitz RA value function exactly coincides with the reach-avoid set for arbitrary Lipschitz dynamics and bounded disturbances is not yet established rigorously enough to be load-bearing. The discounted infinite-horizon cost typically produces a strictly positive value outside the set that only approaches the indicator function as the discount factor tends to zero; the manuscript must supply an explicit argument showing why the chosen running-cost design recovers the exact boundary for any fixed positive discount factor.

    Authors: We acknowledge that the current manuscript would benefit from a more explicit derivation of the exact level-set property. The running cost in our formulation is constructed as a positive definite function that vanishes exactly on the target set and is bounded below by a positive constant on the complement of the avoid set. Combined with the discounting, this ensures that the value function is strictly positive outside the reach-avoid set for any fixed discount factor, because any trajectory starting outside must incur a positive integrated cost before reaching the target. In the revised version, we will insert a dedicated lemma following Definition 3 that proves this property using the Lipschitz continuity of the dynamics and the boundedness of the disturbance set. This will make the argument load-bearing as requested. revision: yes

  2. Referee: [§4, viscosity-solution uniqueness argument] While contractivity of the Bellman operator is asserted, the proof that this operator maps the space of Lipschitz functions into itself, and that the fixed point satisfies the HJ variational inequality with the exact level-set property, needs to be checked against possible boundary discrepancies introduced by discounting. A concrete counter-example or a limiting argument should be added if the exactness does not follow directly from contractivity alone.

    Authors: We agree that the connection between contractivity, the viscosity solution property, and the exact level set requires additional clarification to rule out boundary effects from discounting. In the revision, we will augment the proof in §4 with a limiting argument in which the discount factor is held fixed and the behavior near the boundary of the reach-avoid set is analyzed. Specifically, we will show that the fixed point of the contractive operator coincides with the unique viscosity solution of the Hamilton-Jacobi variational inequality (HJVI), and that this solution's zero sublevel set is invariant under the dynamics in the required sense. We will also note that a counter-example would contradict the contraction property in the complete metric space of Lipschitz functions equipped with the sup norm. revision: yes

Circularity Check

0 steps flagged

No significant circularity; core RA value function and operator properties derived independently

full rationale

The paper constructs a new Lipschitz continuous RA value function for the discounted infinite-horizon formulation and proves contractivity of the associated Bellman backup operator plus uniqueness as the viscosity solution to the Hamilton-Jacobi variational inequality. These steps are presented as direct consequences of the chosen running-cost design and discounting for general nonlinear systems. The SA extension integrates the RA results with a cited Robust Control Lyapunov-Value Function but does not make the central RA claims depend on self-referential definitions, fitted inputs renamed as predictions, or load-bearing self-citation chains. The derivation chain remains self-contained against the stated assumptions without reducing the zero-sublevel characterization to an input by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claims rest on standard domain assumptions for nonlinear systems plus the newly introduced value function; no free parameters are explicitly fitted in the abstract.

axioms (1)
  • domain assumption: The problems are posed for general nonlinear continuous-time systems with disturbances in a zero-sum game setting.
    Stated directly as the setting for the RA and SA problems in the abstract.
invented entities (1)
  • Lipschitz continuous RA value function (no independent evidence)
    purpose: To characterize the reach-avoid set exactly via its zero sublevel set and enable contractive Bellman backups.
    Newly designed in the paper for the RA problem.

pith-pipeline@v0.9.0 · 5711 in / 1357 out tokens · 71074 ms · 2026-05-22T16:20:24.236039+00:00 · methodology

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Value Functions for Temporal Logic: Optimal Policies and Safety Filters

    cs.RO · 2026-05 · unverdicted · novelty 6.0

    Non-Markovian policies from decomposed temporal logic value functions are proven optimal for nested Until, Globally, and Globally-Until specifications and extend Q-function safety filters to complex tasks.
