pith. sign in

arxiv: 2605.18566 · v1 · pith:M7LXRSPHnew · submitted 2026-05-18 · 📡 eess.SY · cs.SY

HJ-Gauss: A Monte-Carlo HJ Reachability Scheme

Pith reviewed 2026-05-20 09:08 UTC · model grok-4.3

classification 📡 eess.SY cs.SY
keywords Hamilton-Jacobi reachabilityMonte Carlo methodsgrid-free algorithmsbackward reachable tubeshigh-dimensional systemsCole-Hopf transformationviscous PDEspursuit-evasion games
0
0 comments X

The pith

A Cole-Hopf-style transform plus Monte Carlo sampling solves viscous Hamilton-Jacobi reachability without grids or exponential memory.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a grid-free algorithm for backward reachable tubes by locally linearizing the nonlinear viscous Hamilton-Jacobi PDE around frozen coefficients. This linearization permits a generalized Cole-Hopf transformation that converts the problem into a sequence of linear heat equations whose fundamental solutions are Gaussian heat kernels. Value functions and their gradients are recovered as Monte Carlo expectations over trajectories sampled from those kernels. The resulting scheme scales linearly with state dimension and number of samples while supplying a finite-sample concentration bound of order one over square root of N and a conditional linear convergence rate for the Picard iteration. The approach is demonstrated on multi-agent pursuit-evasion games up to 45 dimensions where classical grid solvers are infeasible.

Core claim

A local PDE linearization with frozen coefficients allows a generalized Cole-Hopf-type transformation to reduce the nonlinear viscous HJ equation to a sequence of linear heat equations. Solutions of these heat equations admit explicit Gaussian heat-kernel representations, so the value function and its spatial gradient are recovered exactly as Monte Carlo expectations over roll-outs on the corresponding Gaussian densities. The resulting storage-free algorithm scales as N times n, admits an O(N to the minus one half) finite-sample error bound, and exhibits conditional linear convergence under the Picard iteration.

What carries the argument

Generalized Cole-Hopf-type transformation that maps the nonlinear viscous HJ PDE to a sequence of linear heat equations with Gaussian heat-kernel solutions, followed by Monte Carlo expectation recovery of the value function and gradient.

If this is right

  • Reachability analysis becomes feasible for systems whose state dimension makes grid storage impossible.
  • The algorithm decouples memory cost from dimension, scaling only as N times n.
  • Finite-sample concentration guarantees and conditional linear convergence are available for the Picard scheme.
  • Validation on pursuit-evasion games up to 45 dimensions yields relative L2 errors between 0.03 and 0.20 with modest CPU times.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The sampling approach could be paired with variance-reduction techniques such as importance sampling to tighten error bounds in practice.
  • The same linearization-plus-Monte-Carlo pattern might extend to other nonlinear PDEs arising in robust control or differential games.
  • Real-time safety monitoring of high-dimensional learned policies could become practical once the per-query cost is further reduced.
  • Adaptive choice of linearization points or time-step sizes could relax the frozen-coefficient approximation without losing the grid-free property.

Load-bearing premise

The local linearization with frozen coefficients approximates the original nonlinear viscous HJ PDE well enough over the relevant time horizon and state space.

What would settle it

A low-dimensional benchmark where the Monte Carlo estimates deviate from an accurate grid solution by more than the predicted O(N to the minus one half) rate, or where the computed reachable tube fails to contain the true unsafe set when validated against a trusted solver.

Figures

Figures reproduced from arXiv: 2605.18566 by Lekan Molu, Namhoon Cho, Venkatraman Renganathan.

Figure 1
Figure 1. Figure 1: Rockets BRT slices: Grid-based levelsetpy from [Molu, 2024b] vs. Monte Carlo (ours). The bottom row shows a heat map of the pointwise errors at various relative vehicle orientations θ; errors are largest near the boundary of the usable part where |Dv δ | is largest. Central Contribution: Monte-Carlo Trajectory Optimization of Quasi-Linear BRTs A backward reachable tube (BRT) is the set of initial states fr… view at source ↗
Figure 2
Figure 2. Figure 2: Two Dubins’ vehicles in relative Cartesian coordinates. Reprinted from Mitchell [2020]. C Further Numerical Results This example was originally proposed by Merz [1972] as an iteration upon Isaacs [1999]’s homicidal chauffeur game, whereupon a pursuit-evasion game between two players with similar speeds and minimum turn radii, is thoroughly analyzed. In Mitchell [2020], this problem was established as a ben… view at source ↗
Figure 3
Figure 3. Figure 3: The target set (left) and the boundary of the useable part of the state space after the differential game between P and E [PITH_FULL_IMAGE:figures/full_fig_p025_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Dubins Vehicle BRT Slices ℓ : [−T, 0] × X → R so that for a t ∈ [0, T], where T > 0 is T = {x ∈ X |ℓ(0, x) ≤ 0} (C.3) R([−t, 0], T ) = {x ∈ X |ℓ(t, x) ≤ 0}, (C.4) When t > 0, the implicit surface representation is the following HJI PDE ∂ ∂t ℓ(t, x) + min (0, H(x, ∇xℓ(t, x))) = 0. (C.5) It is easy to verify that the Hamiltonian is H(x, p) = p1(vp cos x3 − ve) − p2(vp sin x3) + w|p1x2 − p2x1 − p3| − w|p3|. (… view at source ↗
Figure 5
Figure 5. Figure 5: State trajectories of the double integral plant. The solid curves with upward-pointing [PITH_FULL_IMAGE:figures/full_fig_p026_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: The time to reach the origin as 2D slice comparison for the double integratral plant [PITH_FULL_IMAGE:figures/full_fig_p027_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Left: Motion of two rockets on a Cartesian xz-plane with a thrust inclination in relative coordinates given by θ := up − ue. Right: Representation of the initial values as a heat map for the two rockets described in (20) [PITH_FULL_IMAGE:figures/full_fig_p028_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: Initial and final backward reachable tubes for the rocket system (cf. [PITH_FULL_IMAGE:figures/full_fig_p029_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: Dubins Vehicles’ BRT Slices: Levelsetpy vs Monte-Carlo (Ours). The bottom row shows a [PITH_FULL_IMAGE:figures/full_fig_p030_9.png] view at source ↗
read the original abstract

Backward reachable tubes (BRTs), computed via viscous Hamilton-Jacobi (HJ) partial differential equations, provide principled safety certificates for learned controllers and planning algorithms in trustworthy machine learning. However, classical grid-based HJ solvers require $O(M^n)$ memory footprint for $M$ grid points per $n$ state dimension. This renders them impractical for high-dimensional systems. We address this bottleneck with a local PDE linearization that enables a frozen-coefficient sampling scheme for the viscous HJ PDE: a generalized Cole-Hopf-type transformation reduces the nonlinear HJ equation to a sequence of linear heat equations whose solutions admit Gaussian heat-kernel representations. The value function and its spatial gradient are then recovered via roll-outs of Monte Carlo expectations on Gaussian densities, yielding a storage and grid-free algorithm that scales as $N\cdot n$ for $N$ samples. This decoupling of memory from dimensionality enables reachability analysis on problems where grid-based methods are simply impossible. We prove a finite-sample concentration bound $O(N^{-1/2})$ error and conditional linear convergence for the introduced Monte-Carlo Picard iterative scheme. Numerical validation on pursuit-evasion games demonstrates relative $L^2_{\text{rel}}$ errors of $0.03 - 0.20$, with $14-26$ second wall-clock times per 2D slice on a CPU. Crucially, the method scales with validation on up to (but not limited to) $n=45$-dimensional multi-agent games.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces HJ-Gauss, a Monte-Carlo scheme for backward reachable tubes via viscous Hamilton-Jacobi PDEs. A generalized Cole-Hopf-type transformation reduces the nonlinear viscous HJ equation to a sequence of linear heat equations solved by Gaussian heat-kernel Monte Carlo expectations, yielding a grid-free algorithm scaling as N·n. The paper proves an O(N^{-1/2}) finite-sample concentration bound and conditional linear convergence for the Monte-Carlo Picard scheme, with numerical results on pursuit-evasion games showing relative L2 errors of 0.03-0.20 and scaling to n=45.

Significance. If the approximations hold, the work provides a meaningful advance for high-dimensional reachability by decoupling memory from state dimension, enabling safety certificates where grid methods are intractable. Strengths include the machine-checked-style concentration bound, conditional convergence result, parameter-free Monte Carlo recovery of value and gradient, and empirical scaling to 45 dimensions on multi-agent games.

major comments (2)
  1. Abstract and derivation: the local PDE linearization with frozen coefficients is invoked to convert the nonlinear viscous HJ PDE into linear heat equations whose Monte Carlo solutions are claimed to recover the original reachability value function. No quantitative bound on the linearization residual, its propagation under the Picard iteration, or its control over the full time horizon and state space is referenced; this approximation is load-bearing for both the correctness of the safety certificates and the applicability of the O(N^{-1/2}) bound to the target problem.
  2. Numerical validation: relative L2 errors of 0.03-0.20 are reported for the pursuit-evasion examples, yet the manuscript provides no explicit verification or residual measurement confirming that the frozen-coefficient approximation remains sufficiently accurate for those specific systems and horizons, leaving open whether the observed errors reflect the Monte Carlo scheme or the linearization step.
minor comments (2)
  1. Clarify the precise statement of the conditional linear convergence (e.g., under what assumptions on the linearization) and add a short remark on how the Gaussian density roll-outs are implemented for reproducibility.
  2. The wall-clock times (14-26 s per 2D slice) are informative; a brief comparison table against at least one other high-dimensional reachability method would help contextualize the practical gains.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We appreciate the referee's thorough review and constructive feedback on our manuscript. We respond to the major comments point by point below, indicating the changes we plan to make in the revised version.

read point-by-point responses
  1. Referee: Abstract and derivation: the local PDE linearization with frozen coefficients is invoked to convert the nonlinear viscous HJ PDE into linear heat equations whose Monte Carlo solutions are claimed to recover the original reachability value function. No quantitative bound on the linearization residual, its propagation under the Picard iteration, or its control over the full time horizon and state space is referenced; this approximation is load-bearing for both the correctness of the safety certificates and the applicability of the O(N^{-1/2}) bound to the target problem.

    Authors: We agree that providing a quantitative bound on the linearization residual would strengthen the theoretical foundation. The local PDE linearization with frozen coefficients is a key step to apply the generalized Cole-Hopf transformation, reducing the problem to solvable linear heat equations. Deriving a tight global bound is non-trivial because the approximation error depends on the variation of the coefficients over the domain and time. In the revision, we will include a detailed discussion of the approximation error, including a local residual estimate and analysis of its propagation through the Picard iteration. We will also clarify the conditions under which the O(N^{-1/2}) bound applies to the approximated problem. revision: yes

  2. Referee: Numerical validation: relative L2 errors of 0.03-0.20 are reported for the pursuit-evasion examples, yet the manuscript provides no explicit verification or residual measurement confirming that the frozen-coefficient approximation remains sufficiently accurate for those specific systems and horizons, leaving open whether the observed errors reflect the Monte Carlo scheme or the linearization step.

    Authors: We concur that distinguishing the sources of error is important for validating the method. The reported relative L2 errors include both the Monte Carlo sampling variance and any linearization approximation error. To address this, we will add numerical results in the revised manuscript that compute the residual of the frozen-coefficient approximation for the pursuit-evasion games at the relevant time horizons and state points. This will help quantify the contribution of each component to the total error. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained via PDE transformation and sampling

full rationale

The paper's central chain begins with the viscous HJ PDE, applies a local frozen-coefficient linearization (an explicit modeling assumption), invokes a generalized Cole-Hopf transformation to obtain linear heat equations, and recovers the solution via Monte Carlo expectations over Gaussian kernels. Each step is a standard first-principles reduction or probabilistic representation theorem applied to the transformed linear PDE; the Monte Carlo estimator computes quantities defined by the linear heat kernel and is not fitted to the original nonlinear reachability values. The finite-sample concentration bound follows from standard concentration inequalities on the sampling scheme itself. No step reduces by definition or self-citation to the target BRT outputs, and numerical examples serve as external validation rather than circular fitting. The method is therefore an approximation scheme whose correctness rests on the validity of the linearization assumption, not on internal redefinition.

Axiom & Free-Parameter Ledger

1 free parameters · 2 axioms · 0 invented entities

The central claim rests on the validity of the generalized Cole-Hopf reduction for the viscous HJ PDE and on the accuracy of the frozen-coefficient approximation; no new physical entities are introduced and the only tunable quantity is the sample count N chosen for accuracy.

free parameters (1)
  • number of samples N
    User-chosen parameter that controls the Monte-Carlo error; not fitted to reachability data but selected for desired precision.
axioms (2)
  • domain assumption The viscous Hamilton-Jacobi PDE admits a generalized Cole-Hopf transformation that reduces it to a sequence of linear heat equations with Gaussian kernels.
    Invoked to obtain the linear heat equations whose solutions are sampled.
  • domain assumption The local linearization with frozen coefficients remains a sufficiently accurate approximation over the relevant time and state domain.
    Required for the sampling scheme to converge to the original nonlinear PDE solution.

pith-pipeline@v0.9.0 · 5802 in / 1463 out tokens · 46477 ms · 2026-05-20T09:08:24.474120+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

12 extracted references · 12 canonical work pages

  1. [1]

    Mitchell

    I. Mitchell. A toolbox of level set methods, version 1.0.The University of British Columbia, UBC CS TR-2004-09, pages 1–94, July

  2. [2]

    ISSN 00189286. L. Molu. The Python LevelSet Toolbox (LevelSetPy). In2024 IEEE 63rd Conference on Decision and Control (CDC), pages 8938–8945, 2024a. doi: 10.1109/CDC56724.2024.10886640. L. Molu. The python levelset toolbox (levelsetpy). InIEEE 63rd Conference on Decision and Control (CDC), pages 8938–8945, 2024b. doi: 10.1109/CDC56724.2024.10886640. L. Mo...

  3. [3]

    min-max" or “max- min

    12 Appendix Contents Contents Appendix Contents. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .13 A Background and Preliminaries (Unabridged).. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .14 A.1 Notations and Terminologies . . . . . . . . ...

  4. [4]

    Following 2Time is reversed in BRT computational scenarios

    ={x∈R n | ∃β∈ B(t)∀u∈ U(t),∃¯τ∈[−T,0],ξ(¯τ)∈ L 0}, (Target-Tube) Read: The set of states from which there exists a strategy of P such that, for all controls of E, the resulting trajectoryreaches and remains in the target setwithin the interval [−T,0] . Following 2Time is reversed in BRT computational scenarios. 16 Lemma 2 of [Mitchell et al., 2005], the s...

  5. [5]

    Using assumption (iii), the first term is bounded by |H(t;x m, p)−H(t;x m, q)| |p|2 ≤ LH m2 0 |p−q|.(B.38) For the second term, note that 1 |p|2 − 1 |q|2 = |q|2 − |p|2 |p|2|q|2 ≤ |q| − |p| (|q|+|p|) m4 0 ≤ 2P∗ m4 0 |p−q|,(B.39) where the final inequality uses |p|,|q| ≤P ∗ from assumption (ii). Using |H(t;x m, q)| ≤H ∗ from assumption (iii), we conclude th...

  6. [6]

    as an iteration upon Isaacs [1999]’s homicidal chauffeur game, whereupon a pursuit-evasion game between two players with similar speeds and minimum turn radii, is thoroughly analyzed. In Mitchell [2020], this problem was established as a benchmark for testing the solubility of capturable set of states (the backward reachable tube) in Merz’s classical purs...

  7. [7]

    Capture occurs when the distance ∥P E∥2 between the pursuer and the evader becomes less than a specified radius

    Choosing the Cartesian coordinate for motion representation, the state vector of the game with E at the origin can be characterized by its x1, x2 position relative to P and the angle θ between the two vehicles. Capture occurs when the distance ∥P E∥2 between the pursuer and the evader becomes less than a specified radius. The relative equations of motion,...

  8. [8]

    Here, we focus on the construction of the BUP

    For a detailed treatment of the barrier surface, we refer readers to a proper analysis as elucidated in Mitchell [2020]. Here, we focus on the construction of the BUP. The set of states that constitute the useable part and its boundary are respectively a function of the implicit surface function representation 24 Figure 3:The target set (left) and the bou...

  9. [9]

    trans- port

    C.1 The Double Integral Plant Here, we analyze a time-optimal control problem to determine what admissible control4 can “trans- port" the system under consideration to a desired “origin" in the shortest possible time. We consider the double integral plant Bhat and Bernstein [1998], Wonham

  10. [10]

    We shall conclude the section by comparing the analytic and the overapproximated numerical solution (using the LevelSetPy toolbox) to the time to reach the origin problem

    to obtain a time-optimal feedback control design; introduce the notion of isochrones and switching surfaces; and discuss the analytic and approximate solutions (with our library) to the time-optimal control problem for a double integrator. We shall conclude the section by comparing the analytic and the overapproximated numerical solution (using the LevelS...

  11. [11]

    which is to launch a rocket in fixed time to a desired altitude, given a final vertical velocity component and a maximum final horizontal component as constraints. The rocket’s motion is dictated by the following differential equations (under Dreyfus’ assumptions) ˙x1 =x 3;x 1(t0) = 0;(C.13a) ˙x2 =x 4, x 2(t0) = 0;(C.13b) ˙x3 =acosu, x 3(t0) = 0;(C.13c) ˙...

  12. [12]

    We seta e =a p = 64f t/sec2 andg= 32f t/sec 2 as in Dreyfus’ original example

    computed using the method outlined in Evans and Souganidis [1984], Osher and Sethian [1988], Mitchell [2004]. We seta e =a p = 64f t/sec2 andg= 32f t/sec 2 as in Dreyfus’ original example. We compute the reachable set by optimizing for the paths of slowest-quickest descent in equation (C.18). Suppose that the maximizing ue is ¯ue and the minimizing up is ...