Importance Sampling for Statistical Certification of Viable Initial Sets

Elizabeth Dietrich; Hanna Krasowski; Murat Arcak; Vegard Flovik

arxiv: 2604.02939 · v1 · submitted 2026-04-03 · 📡 eess.SY · cs.SY

Importance Sampling for Statistical Certification of Viable Initial Sets

Elizabeth Dietrich , Hanna Krasowski , Vegard Flovik , Murat Arcak This is my paper

Pith reviewed 2026-05-13 19:40 UTC · model grok-4.3

classification 📡 eess.SY cs.SY

keywords importance samplingviable initial setsempirical Bernstein inequalityfinite-sample guaranteesstatistical certificationcontrol specificationssimulation-based verificationrare-event estimation

0 comments

The pith

An empirical Bernstein inequality for weighted variables enables importance sampling to certify viable initial sets with finite-sample guarantees.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper develops a simulation-based approach to certify viable initial sets, collections of starting conditions whose trajectories meet a given control specification. It employs importance sampling to direct simulations toward the scarce regions where violations are likely instead of sampling uniformly from the full space. To support the estimators, the authors derive an empirical Bernstein inequality that applies to the weighted samples arising from importance sampling. This produces rigorous finite-sample probabilistic bounds on the violation probability even when the underlying model is high-fidelity or black-box. A sympathetic reader cares because the method supplies statistical certificates without depending on the simplified models that conventional analysis often requires.

Core claim

We propose a simulation-based framework that employs importance sampling to estimate the probability of specification violations under high-fidelity models. By deriving an empirical Bernstein inequality for weighted random variables, we obtain finite-sample guarantees for the resulting estimators. The approach targets high-risk regions to efficiently detect scarce violations and is demonstrated on systems including an Adaptive Cruise Control benchmark with improved convergence of the bounds.

What carries the argument

Empirical Bernstein inequality for weighted random variables, which supplies concentration bounds adapted to the non-uniform weights produced by importance sampling.

If this is right

The resulting bounds on violation probability converge with fewer samples than uniform Monte Carlo when the importance distribution successfully focuses on rare events.
Viable initial sets become certifiable for high-fidelity or black-box models where analytic or simplified-model methods cannot be applied.
The statistical guarantees are non-asymptotic and therefore hold after any finite number of simulations.
The same weighted concentration result applies to other rare-event estimation tasks in control systems and improves bound tightness on benchmarks such as adaptive cruise control.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the importance distribution itself is adapted during sampling, the method could reach the same bound tightness with still fewer total trajectories.
The weighted Bernstein construction may transfer directly to certifying reachability or invariance properties beyond the specification-violation setting.
Analogous weighted inequalities could improve sample efficiency in rare-event problems outside control, such as reliability analysis or safety verification in robotics.

Load-bearing premise

An effective importance sampling distribution can be chosen or adapted to sufficiently target high-risk regions where specification violations occur without introducing bias that invalidates the finite-sample bounds.

What would settle it

A concrete counter-example in which the derived inequality is violated by the true violation probability for some choice of importance weights, or a simulation experiment on a system with known exact violation rate where the certified upper bound fails to contain the true rate at the claimed .

Figures

Figures reproduced from arXiv: 2604.02939 by Elizabeth Dietrich, Hanna Krasowski, Murat Arcak, Vegard Flovik.

**Figure 1.** Figure 1: ACC candidate VIS C (blue) and learned failure-prone set F (red) with zoom in of boundary region. The over-approximative F does not capture all failure points (×), due to uncertainty in the GP estimate. V. EMPIRICAL RESULTS We demonstrate Alg. 1 on quadrotor flight control and an Adaptive Cruise Control (ACC) braking scenario. The low-dimensional ACC system admits an analytical estimate of the candidate VI… view at source ↗

**Figure 2.** Figure 2: Top: ACC ϵ convergence to true failure probability. Bottom: Convergence rate of ACC estimators. The IS estimator converges rapidly to the true failure probability, closely following a −1.0 slope, whereas the binomial estimator converges with more variance at a rate close to −0.5. In this case study, α has a negligible effect on the IS estimator. −0.4 −0.2 0 0.2 0.4 −0.4 −0.2 0 0.2 0.4 h [m] ˙h [m/s] F −0.5… view at source ↗

**Figure 3.** Figure 3: Quadrotor GP surrogate model with failures and the learned failure [PITH_FULL_IMAGE:figures/full_fig_p005_3.png] view at source ↗

**Figure 4.** Figure 4: Nominal, surrogate, and proposed distributions: the defensive mixture increases the probability of sampling from the failure region. [PITH_FULL_IMAGE:figures/full_fig_p006_4.png] view at source ↗

read the original abstract

We study the problem of statistically certifying viable initial sets (VISs) -- sets of initial conditions whose trajectories satisfy a given control specification. While VISs can be obtained from model-based methods, these methods typically rely on simplified models. We propose a simulation-based framework to certify VISs by estimating the probability of specification violations under a high-fidelity or black-box model. Since detecting these violations may be challenging due to their scarcity, we propose a sample-efficient framework that leverages importance sampling to target high-risk regions. We derive an empirical Bernstein inequality for weighted random variables, enabling finite-sample guarantees for importance sampling estimators. We demonstrate the effectiveness of the proposed approach on two systems and show improved convergence of the resulting bounds on an Adaptive Cruise Control benchmark.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper derives an empirical Bernstein inequality for weighted variables to support importance sampling in certifying viable initial sets from black-box simulations, but the finite-sample guarantees look shaky if likelihood ratios are unbounded.

read the letter

The core contribution here is a simulation-only method to certify viable initial sets by estimating violation probabilities with importance sampling, plus a custom empirical Bernstein bound for the resulting weighted estimators. This targets rare specification violations in control systems where model-based approaches fall short, and they show faster convergence on the adaptive cruise control benchmark compared to plain Monte Carlo sampling. The combination of IS with a tailored concentration inequality for this certification task is the main technical step forward, and it is presented as independent of fitted parameters. That part is useful for people who need non-asymptotic guarantees from black-box trajectories. The proposal distribution remains a free parameter, which is honest but shifts the practical burden onto the user to pick one that hits the violation regions without blowing up the weights. The soft spot is exactly the one flagged in the stress test. Standard empirical Bernstein inequalities require bounded random variables or explicit moment conditions, yet the importance sampling summands involve likelihood ratios that are typically unbounded when the proposal oversamples low-probability violation events. The abstract claims a derivation for weighted variables, but without seeing explicit truncation, a uniform bound, or a variance-only version in the full text, it is unclear whether the finite-sample guarantee actually holds for the general black-box models they target. If the derivation silently assumes bounded weights, the claimed certificates could be invalid in practice. This work is aimed at researchers in statistical verification and safety-critical control who already use Monte Carlo methods and want sample efficiency. It is coherent enough on its own terms to deserve a serious referee, mainly to pressure-test the inequality derivation and the conditions under which the bounds apply. I would send it to review rather than desk reject.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes a simulation-based framework for statistically certifying viable initial sets (VISs) in control systems by estimating the probability of specification violations using importance sampling to efficiently target rare events in high-fidelity or black-box models. A key contribution is the derivation of an empirical Bernstein inequality for weighted random variables, which provides finite-sample guarantees for the importance sampling estimators. The approach is demonstrated on two systems, showing improved convergence of bounds on an Adaptive Cruise Control benchmark.

Significance. If the derived inequality holds under the stated conditions, this provides a valuable tool for obtaining rigorous statistical certificates for safety properties in complex dynamical systems where analytical methods are intractable. The use of importance sampling addresses the sample inefficiency of standard Monte Carlo for rare violation events, and the finite-sample bounds enable certification without relying on asymptotic approximations. This could have significant impact in safety-critical applications like autonomous vehicles.

major comments (2)

[Derivation of the empirical Bernstein inequality] The derivation of the empirical Bernstein inequality for weighted random variables (presented as the central technical contribution) does not appear to explicitly address the case of unbounded likelihood ratios p/q. Standard empirical Bernstein bounds require bounded summands or explicit moment conditions; in the IS setting the effective variables are I(violation) * (p/q), which are typically unbounded when q targets rare violation regions. Please clarify in the derivation section whether truncation, a uniform bound, or a finite-variance-only version is used, and verify that all conditions hold for the black-box models targeted.
[Benchmarks and numerical results] In the experimental section on benchmarks, details are missing on how the proposal distribution q is chosen or adapted, whether weights are truncated in practice, and verification that the inequality conditions are satisfied in the reported runs. Without these, it is unclear if the improved convergence bounds are valid or if bias from the IS distribution invalidates the finite-sample guarantees.

minor comments (2)

[Notation and definitions] The notation for the weighted estimators and importance weights could be made more explicit, e.g., by adding a short table of symbols or a dedicated definitions paragraph early in the paper.
[Introduction] A few sentences on related work in statistical verification and importance sampling for rare-event estimation would help situate the contribution.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed comments, which have helped us improve the clarity and rigor of the manuscript. We address each major comment below and have made corresponding revisions to the paper.

read point-by-point responses

Referee: [Derivation of the empirical Bernstein inequality] The derivation of the empirical Bernstein inequality for weighted random variables (presented as the central technical contribution) does not appear to explicitly address the case of unbounded likelihood ratios p/q. Standard empirical Bernstein bounds require bounded summands or explicit moment conditions; in the IS setting the effective variables are I(violation) * (p/q), which are typically unbounded when q targets rare violation regions. Please clarify in the derivation section whether truncation, a uniform bound, or a finite-variance-only version is used, and verify that all conditions hold for the black-box models targeted.

Authors: We thank the referee for highlighting this important point. The derivation in Section 3 assumes that the effective weighted variables (indicator times likelihood ratio) have bounded range, which is a standard condition for empirical Bernstein inequalities to hold with the stated finite-sample guarantees. This boundedness is achieved by selecting proposal distributions q with support that ensures p/q remains bounded in the relevant regions (e.g., via compact support or adaptive proposals that avoid extreme ratios). We have added an explicit paragraph in the derivation section clarifying this assumption, noting that truncation of weights can be applied if needed for black-box models, and confirming that the finite-variance condition is satisfied under these choices. The black-box setting is handled by treating the model as a simulator that allows evaluation of the indicator without requiring analytic bounds beyond the proposal design. revision: yes
Referee: [Benchmarks and numerical results] In the experimental section on benchmarks, details are missing on how the proposal distribution q is chosen or adapted, whether weights are truncated in practice, and verification that the inequality conditions are satisfied in the reported runs. Without these, it is unclear if the improved convergence bounds are valid or if bias from the IS distribution invalidates the finite-sample guarantees.

Authors: We agree that these implementation details are essential for validating the guarantees. We have substantially expanded the experimental section (now Section 5) to describe: the construction of q as a mixture of nominal dynamics and targeted high-risk regions derived from a low-fidelity model; that no weight truncation was applied in the reported runs (as the chosen q kept ratios bounded); and explicit post-hoc verification that the empirical variance and range conditions of the inequality held in all simulations. These additions confirm that the reported bounds are valid under the finite-sample guarantees without introducing bias that would invalidate the certificates. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical Bernstein derivation for weighted variables is presented as independent mathematical contribution

full rationale

The paper's core technical step is the derivation of an empirical Bernstein inequality tailored to weighted random variables arising from importance sampling. This is explicitly framed as a new finite-sample guarantee rather than a re-labeling of fitted quantities, a self-citation chain, or an ansatz imported from prior work by the same authors. No equations or claims in the provided abstract reduce the bound to its own inputs by construction; the derivation is treated as self-contained mathematical work that stands apart from the subsequent application to viable initial set certification. The reader's assessment of score 2.0 aligns with this, as the inequality is not statistically forced by data fitting or self-referential definitions.

Axiom & Free-Parameter Ledger

1 free parameters · 2 axioms · 0 invented entities

The central claim rests on standard simulation assumptions plus the new inequality; no invented physical entities are introduced.

free parameters (1)

importance sampling proposal distribution
The distribution used to bias sampling toward high-risk regions must be selected or adapted; its parameters are free choices that affect estimator variance and bound tightness.

axioms (2)

domain assumption High-fidelity model trajectories can be accurately and repeatedly simulated from any initial condition
Required for generating the weighted samples used in the certification procedure.
domain assumption Specification violations are sufficiently rare that importance sampling can meaningfully reduce variance
Underpins the motivation and efficiency claims for the sampling method.

pith-pipeline@v0.9.0 · 5424 in / 1333 out tokens · 36475 ms · 2026-05-13T19:40:08.492776+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

Foundation.RealityFromDistinction reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

We derive an empirical Bernstein inequality for weighted random variables

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

33 extracted references · 33 canonical work pages · 1 internal anchor

[1]

Set invariance in control,

F. Blanchini, “Set invariance in control,”Automatica, vol. 35, no. 11, pp. 1747–1767, 1999

work page 1999
[2]

Safety verification of hybrid systems us- ing barrier certificates,

S. Prajna and A. Jadbabaie, “Safety verification of hybrid systems us- ing barrier certificates,” inInternational workshop on hybrid systems: Computation and control, 2004, pp. 477–492

work page 2004
[3]

A time-dependent Hamilton–Jacobi formulation of reachable sets for continuous dynamic games,

I. M. Mitchell, A. M. Bayen, and C. J. Tomlin, “A time-dependent Hamilton–Jacobi formulation of reachable sets for continuous dynamic games,”IEEE Transactions on Automatic Control, vol. 50, no. 7, pp. 947–957, 2005

work page 2005
[4]

A. B. Kurzhanski and P. Varaiya,Ellipsoidal Techniques for Reacha- bility Analysis. Springer, 2006

work page 2006
[5]

Aubin,Viability Theory, 2nd ed

J.-P. Aubin,Viability Theory, 2nd ed. Boston: Birkh ¨auser, 2009

work page 2009
[6]

Tabuada,Verification and Control of Hybrid Systems: A Symbolic Approach

P. Tabuada,Verification and Control of Hybrid Systems: A Symbolic Approach. Springer, 2009

work page 2009
[7]

Belta, B

C. Belta, B. Yordanov, and E. Gol,Formal Methods for Discrete-Time Dynamical Systems. Springer, 2017

work page 2017
[8]

Funnel libraries for real-time robust feedback motion planning,

A. Majumdar and R. Tedrake, “Funnel libraries for real-time robust feedback motion planning,”The International Journal of Robotics Research, vol. 36, no. 8, pp. 947–982, 2017

work page 2017
[9]

Simulation-Based Approaches for Verification of Embedded Control Systems: An Overview of Traditional and Advanced Modeling, Test- ing, and Verification Techniques,

J. Kapinski, J. V . Deshmukh, X. Jin, H. Ito, and K. R. Butts, “Simulation-Based Approaches for Verification of Embedded Control Systems: An Overview of Traditional and Advanced Modeling, Test- ing, and Verification Techniques,”IEEE Control Systems Magazine, vol. 36, no. 6, pp. 45–64, 2016

work page 2016
[10]

Testing Cyber-Physical Systems through Bayesian Optimization,

J. Deshmukh, M. Horvat, X. Jin, R. Majumdar, and V . S. Prabhu, “Testing Cyber-Physical Systems through Bayesian Optimization,” ACM Trans. Embed. Comput. Syst., vol. 16, no. 5s, 2017

work page 2017
[11]

Falsification-Based Robust Ad- versarial Reinforcement Learning,

X. Wang, S. Nair, and M. Althoff, “Falsification-Based Robust Ad- versarial Reinforcement Learning,” in19th IEEE International Con- ference on Machine Learning and Applications, 2020, pp. 205–212

work page 2020
[12]

Falsification via Barrier Certificates,

V . Murali, A. Trivedi, and M. Zamani, “Falsification via Barrier Certificates,” inAmerican Control Conference, 2024, pp. 4657–4662

work page 2024
[13]

The Probabilistic Model Checking Landscape,

J.-P. Katoen, “The Probabilistic Model Checking Landscape,” in Proceedings of the 31st Annual ACM/IEEE Symposium on Logic in Computer Science, 2016, p. 31–45

work page 2016
[14]

PRISM: A Tool for Automatic Verification of Probabilistic Systems,

A. Hinton, M. Kwiatkowska, G. Norman, and D. Parker, “PRISM: A Tool for Automatic Verification of Probabilistic Systems,” inTools and Algorithms for the Construction and Analysis of Systems. Springer, 2006, pp. 441–444

work page 2006
[15]

A Survey of Algorithms for Black-Box Safety Validation of Cyber- Physical Systems,

A. Corso, R. J. Moss, M. Koren, R. Lee, and M. J. Kochenderfer, “A Survey of Algorithms for Black-Box Safety Validation of Cyber- Physical Systems,”Journal of Artificial Intelligence Research, vol. 72, pp. 377–428, 2021

work page 2021
[16]

Scalable End-to-End Autonomous Vehicle Testing via Rare-event Simulation,

M. O’Kelly, A. Sinha, H. Namkoong, R. Tedrake, and J. C. Duchi, “Scalable End-to-End Autonomous Vehicle Testing via Rare-event Simulation,” inAdvances in Neural Information Processing Systems, vol. 31, 2018

work page 2018
[17]

Improving Aircraft Collision Risk Estimation Using the Cross-Entropy Method,

Y . Kim and M. J. Kochenderfer, “Improving Aircraft Collision Risk Estimation Using the Cross-Entropy Method,”Journal of Air Trans- portation, vol. 24, no. 2, pp. 55–62, 2016

work page 2016
[18]

Failure Proba- bility Estimation for Black-Box Autonomous Systems using State- Dependent Importance Sampling Proposals,

H. Delecki, S. M. Katz, and M. J. Kochenderfer, “Failure Proba- bility Estimation for Black-Box Autonomous Systems using State- Dependent Importance Sampling Proposals,” in11th International Conference on Control, Decision and Information Technologies, vol. 1, 2025, pp. 1–7

work page 2025
[19]

A theory of the learnable,

L. G. Valiant, “A theory of the learnable,”Commun. ACM, vol. 27, no. 11, p. 1134–1142, Nov. 1984

work page 1984
[20]

Randomized Strategies for Probabilistic Solutions of Uncertain Feasibility and Optimization Problems,

T. Alamo, R. Tempo, and E. F. Camacho, “Randomized Strategies for Probabilistic Solutions of Uncertain Feasibility and Optimization Problems,”IEEE Transactions on Automatic Control, vol. 54, no. 11, pp. 2545–2559, 2009

work page 2009
[21]

Tutorial on Practical Prediction Theory for Classifica- tion,

J. Langford, “Tutorial on Practical Prediction Theory for Classifica- tion,”Journal of Machine Learning Research, vol. 6, pp. 273–306, 2005

work page 2005
[22]

Tempo, G

R. Tempo, G. Calafiore, and F. Dabbene,Randomized Algorithms for Analysis and Control of Uncertain Systems: With Applications, 2nd ed. Springer Publishing Company, 2012

work page 2012
[23]

M. C. Campi and S. Garatti,Introduction to the Scenario Approach. Society for Industrial and Applied Mathematics, 2018

work page 2018
[24]

Probability Inequalities for Sums of Bounded Random Variables,

W. Hoeffding, “Probability Inequalities for Sums of Bounded Random Variables,”Journal of the American Statistical Association, vol. 58, no. 301, pp. 13–30, 1963

work page 1963
[25]

Probability Inequalities for the Sum of Independent Random Variables,

G. Bennett, “Probability Inequalities for the Sum of Independent Random Variables,”Journal of the American Statistical Association, vol. 57, no. 297, pp. 33–45, 1962

work page 1962
[26]

Empirical Bernstein Bounds and Sample Variance Penalization

A. Maurer and M. Pontil, “Empirical Bernstein bounds and sample variance penalization,”arXiv:0907.3740, 2009

work page internal anchor Pith review Pith/arXiv arXiv 2009
[27]

A. B. Owen,Monte Carlo Theory, Methods and Examples. Stanford University, 2013, https://artowen.su.domains/mc/

work page 2013
[28]

Importance sampling: a review,

S. T. Tokdar and R. E. Kass, “Importance sampling: a review,”WIREs Computational Statistics, vol. 2, no. 1, pp. 54–60, 2010

work page 2010
[29]

Weighted Average Importance Sampling and Defensive Mixture Distributions,

T. Hesterberg, “Weighted Average Importance Sampling and Defensive Mixture Distributions,”Technometrics, vol. 37, pp. 185–194, 1995

work page 1995
[30]

Safe and Effective Importance Sam- pling,

A. Owen and Y . Z. Associate, “Safe and Effective Importance Sam- pling,”Journal of the American Statistical Association, vol. 95, no. 449, pp. 135–143, 2000

work page 2000
[31]

Monte Carlo and quasi-Monte Carlo methods,

R. E. Caflisch, “Monte Carlo and quasi-Monte Carlo methods,”Acta Numerica, vol. 7, p. 1–49, 1998

work page 1998
[32]

Data-driven reachable set computation using adaptive Gaussian process classification and Monte Carlo meth- ods,

A. Devonport and M. Arcak, “Data-driven reachable set computation using adaptive Gaussian process classification and Monte Carlo meth- ods,” inAmerican Control Conference. IEEE, 2020, pp. 2629–2634

work page 2020
[33]

Meyer, A

P.-J. Meyer, A. Devonport, and M. Arcak,Interval Reachability Analysis: Bounding Trajectories of Uncertain Systems with Boxes for Control and Verification, 2021

work page 2021

[1] [1]

Set invariance in control,

F. Blanchini, “Set invariance in control,”Automatica, vol. 35, no. 11, pp. 1747–1767, 1999

work page 1999

[2] [2]

Safety verification of hybrid systems us- ing barrier certificates,

S. Prajna and A. Jadbabaie, “Safety verification of hybrid systems us- ing barrier certificates,” inInternational workshop on hybrid systems: Computation and control, 2004, pp. 477–492

work page 2004

[3] [3]

A time-dependent Hamilton–Jacobi formulation of reachable sets for continuous dynamic games,

I. M. Mitchell, A. M. Bayen, and C. J. Tomlin, “A time-dependent Hamilton–Jacobi formulation of reachable sets for continuous dynamic games,”IEEE Transactions on Automatic Control, vol. 50, no. 7, pp. 947–957, 2005

work page 2005

[4] [4]

A. B. Kurzhanski and P. Varaiya,Ellipsoidal Techniques for Reacha- bility Analysis. Springer, 2006

work page 2006

[5] [5]

Aubin,Viability Theory, 2nd ed

J.-P. Aubin,Viability Theory, 2nd ed. Boston: Birkh ¨auser, 2009

work page 2009

[6] [6]

Tabuada,Verification and Control of Hybrid Systems: A Symbolic Approach

P. Tabuada,Verification and Control of Hybrid Systems: A Symbolic Approach. Springer, 2009

work page 2009

[7] [7]

Belta, B

C. Belta, B. Yordanov, and E. Gol,Formal Methods for Discrete-Time Dynamical Systems. Springer, 2017

work page 2017

[8] [8]

Funnel libraries for real-time robust feedback motion planning,

A. Majumdar and R. Tedrake, “Funnel libraries for real-time robust feedback motion planning,”The International Journal of Robotics Research, vol. 36, no. 8, pp. 947–982, 2017

work page 2017

[9] [9]

Simulation-Based Approaches for Verification of Embedded Control Systems: An Overview of Traditional and Advanced Modeling, Test- ing, and Verification Techniques,

J. Kapinski, J. V . Deshmukh, X. Jin, H. Ito, and K. R. Butts, “Simulation-Based Approaches for Verification of Embedded Control Systems: An Overview of Traditional and Advanced Modeling, Test- ing, and Verification Techniques,”IEEE Control Systems Magazine, vol. 36, no. 6, pp. 45–64, 2016

work page 2016

[10] [10]

Testing Cyber-Physical Systems through Bayesian Optimization,

J. Deshmukh, M. Horvat, X. Jin, R. Majumdar, and V . S. Prabhu, “Testing Cyber-Physical Systems through Bayesian Optimization,” ACM Trans. Embed. Comput. Syst., vol. 16, no. 5s, 2017

work page 2017

[11] [11]

Falsification-Based Robust Ad- versarial Reinforcement Learning,

X. Wang, S. Nair, and M. Althoff, “Falsification-Based Robust Ad- versarial Reinforcement Learning,” in19th IEEE International Con- ference on Machine Learning and Applications, 2020, pp. 205–212

work page 2020

[12] [12]

Falsification via Barrier Certificates,

V . Murali, A. Trivedi, and M. Zamani, “Falsification via Barrier Certificates,” inAmerican Control Conference, 2024, pp. 4657–4662

work page 2024

[13] [13]

The Probabilistic Model Checking Landscape,

J.-P. Katoen, “The Probabilistic Model Checking Landscape,” in Proceedings of the 31st Annual ACM/IEEE Symposium on Logic in Computer Science, 2016, p. 31–45

work page 2016

[14] [14]

PRISM: A Tool for Automatic Verification of Probabilistic Systems,

A. Hinton, M. Kwiatkowska, G. Norman, and D. Parker, “PRISM: A Tool for Automatic Verification of Probabilistic Systems,” inTools and Algorithms for the Construction and Analysis of Systems. Springer, 2006, pp. 441–444

work page 2006

[15] [15]

A Survey of Algorithms for Black-Box Safety Validation of Cyber- Physical Systems,

A. Corso, R. J. Moss, M. Koren, R. Lee, and M. J. Kochenderfer, “A Survey of Algorithms for Black-Box Safety Validation of Cyber- Physical Systems,”Journal of Artificial Intelligence Research, vol. 72, pp. 377–428, 2021

work page 2021

[16] [16]

Scalable End-to-End Autonomous Vehicle Testing via Rare-event Simulation,

M. O’Kelly, A. Sinha, H. Namkoong, R. Tedrake, and J. C. Duchi, “Scalable End-to-End Autonomous Vehicle Testing via Rare-event Simulation,” inAdvances in Neural Information Processing Systems, vol. 31, 2018

work page 2018

[17] [17]

Improving Aircraft Collision Risk Estimation Using the Cross-Entropy Method,

Y . Kim and M. J. Kochenderfer, “Improving Aircraft Collision Risk Estimation Using the Cross-Entropy Method,”Journal of Air Trans- portation, vol. 24, no. 2, pp. 55–62, 2016

work page 2016

[18] [18]

Failure Proba- bility Estimation for Black-Box Autonomous Systems using State- Dependent Importance Sampling Proposals,

H. Delecki, S. M. Katz, and M. J. Kochenderfer, “Failure Proba- bility Estimation for Black-Box Autonomous Systems using State- Dependent Importance Sampling Proposals,” in11th International Conference on Control, Decision and Information Technologies, vol. 1, 2025, pp. 1–7

work page 2025

[19] [19]

A theory of the learnable,

L. G. Valiant, “A theory of the learnable,”Commun. ACM, vol. 27, no. 11, p. 1134–1142, Nov. 1984

work page 1984

[20] [20]

Randomized Strategies for Probabilistic Solutions of Uncertain Feasibility and Optimization Problems,

T. Alamo, R. Tempo, and E. F. Camacho, “Randomized Strategies for Probabilistic Solutions of Uncertain Feasibility and Optimization Problems,”IEEE Transactions on Automatic Control, vol. 54, no. 11, pp. 2545–2559, 2009

work page 2009

[21] [21]

Tutorial on Practical Prediction Theory for Classifica- tion,

J. Langford, “Tutorial on Practical Prediction Theory for Classifica- tion,”Journal of Machine Learning Research, vol. 6, pp. 273–306, 2005

work page 2005

[22] [22]

Tempo, G

R. Tempo, G. Calafiore, and F. Dabbene,Randomized Algorithms for Analysis and Control of Uncertain Systems: With Applications, 2nd ed. Springer Publishing Company, 2012

work page 2012

[23] [23]

M. C. Campi and S. Garatti,Introduction to the Scenario Approach. Society for Industrial and Applied Mathematics, 2018

work page 2018

[24] [24]

Probability Inequalities for Sums of Bounded Random Variables,

W. Hoeffding, “Probability Inequalities for Sums of Bounded Random Variables,”Journal of the American Statistical Association, vol. 58, no. 301, pp. 13–30, 1963

work page 1963

[25] [25]

Probability Inequalities for the Sum of Independent Random Variables,

G. Bennett, “Probability Inequalities for the Sum of Independent Random Variables,”Journal of the American Statistical Association, vol. 57, no. 297, pp. 33–45, 1962

work page 1962

[26] [26]

Empirical Bernstein Bounds and Sample Variance Penalization

A. Maurer and M. Pontil, “Empirical Bernstein bounds and sample variance penalization,”arXiv:0907.3740, 2009

work page internal anchor Pith review Pith/arXiv arXiv 2009

[27] [27]

A. B. Owen,Monte Carlo Theory, Methods and Examples. Stanford University, 2013, https://artowen.su.domains/mc/

work page 2013

[28] [28]

Importance sampling: a review,

S. T. Tokdar and R. E. Kass, “Importance sampling: a review,”WIREs Computational Statistics, vol. 2, no. 1, pp. 54–60, 2010

work page 2010

[29] [29]

Weighted Average Importance Sampling and Defensive Mixture Distributions,

T. Hesterberg, “Weighted Average Importance Sampling and Defensive Mixture Distributions,”Technometrics, vol. 37, pp. 185–194, 1995

work page 1995

[30] [30]

Safe and Effective Importance Sam- pling,

A. Owen and Y . Z. Associate, “Safe and Effective Importance Sam- pling,”Journal of the American Statistical Association, vol. 95, no. 449, pp. 135–143, 2000

work page 2000

[31] [31]

Monte Carlo and quasi-Monte Carlo methods,

R. E. Caflisch, “Monte Carlo and quasi-Monte Carlo methods,”Acta Numerica, vol. 7, p. 1–49, 1998

work page 1998

[32] [32]

Data-driven reachable set computation using adaptive Gaussian process classification and Monte Carlo meth- ods,

A. Devonport and M. Arcak, “Data-driven reachable set computation using adaptive Gaussian process classification and Monte Carlo meth- ods,” inAmerican Control Conference. IEEE, 2020, pp. 2629–2634

work page 2020

[33] [33]

Meyer, A

P.-J. Meyer, A. Devonport, and M. Arcak,Interval Reachability Analysis: Bounding Trajectories of Uncertain Systems with Boxes for Control and Verification, 2021

work page 2021