arxiv: 1907.01781 · v1 · pith:DM7W7TSOnew · submitted 2019-07-03 · 🧮 math.ST · stat.AP · stat.TH

Estimating a probability of failure with the convex order in computer experiments

Pith reviewed 2026-05-25 09:50 UTC · model grok-4.3

classification 🧮 math.ST · stat.AP · stat.TH
keywords failure probability · kriging · convex order · computer experiments · stepwise uncertainty reduction · black-box models · sequential design

The pith

A convex-order inequality between two bias-equivalent Kriging estimators ranks their efficiency for black-box failure probability estimation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper addresses estimation of the probability that an expensive black-box physical model exceeds a failure threshold when its inputs are random. Applying Kriging directly yields one estimator, but its practical use is numerically restricted, so an alternative estimator with matching bias is introduced. The central result establishes that the two estimators are ordered by convex order. This ordering gives a direct comparison of their efficiency and bounds on the uncertainty attached to each estimate. The same ordering leads to a sequential rule for adding new evaluation points that reduces uncertainty according to the Stepwise Uncertainty Reduction principle.
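A minimal sketch of how a Kriging-based estimate of a failure probability is commonly computed may help fix ideas. It assumes the standard posterior-mean form, the average over the input distribution of Φ((m_n(x) − T)/σ_n(x)); the page does not show the paper's own formula, so this form is an assumption. The test function `g`, the kernel, its length scale, the threshold and the uniform input law are all hypothetical choices made for illustration.

```python
import numpy as np
from math import erf, sqrt

# Standard normal CDF, vectorized over NumPy arrays.
normal_cdf = np.vectorize(lambda z: 0.5 * (1.0 + erf(z / sqrt(2.0))))

def sq_exp_kernel(a, b, length=0.2, prior_var=1.0):
    """Squared-exponential covariance between two 1-D point sets (assumed kernel)."""
    d = a[:, None] - b[None, :]
    return prior_var * np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(x_train, y_train, x_new, length=0.2, prior_var=1.0, jitter=1e-8):
    """Posterior mean m_n and standard deviation sigma_n of a zero-mean GP."""
    K = sq_exp_kernel(x_train, x_train, length, prior_var) + jitter * np.eye(len(x_train))
    k_star = sq_exp_kernel(x_train, x_new, length, prior_var)  # shape (n, N)
    mean = k_star.T @ np.linalg.solve(K, y_train)
    var = prior_var - np.sum(k_star * np.linalg.solve(K, k_star), axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def failure_probability(mean, sd, threshold):
    """Monte Carlo average of the pointwise posterior exceedance probability."""
    z = (mean - threshold) / np.maximum(sd, 1e-12)  # guard against sd == 0
    return float(np.mean(normal_cdf(z)))

rng = np.random.default_rng(0)
g = lambda x: np.sin(6.0 * x)          # hypothetical cheap stand-in for the black box
x_train = np.linspace(0.0, 1.0, 8)     # n = 8 evaluations of the expensive model
x_mc = rng.uniform(0.0, 1.0, 5000)     # draws from an assumed uniform input law
m_n, s_n = gp_posterior(x_train, g(x_train), x_mc)
p_hat = failure_probability(m_n, s_n, threshold=0.9)
print(p_hat)
```

Only the eight training evaluations touch the expensive model here; the 5000 Monte Carlo points are evaluated on the cheap surrogate, which is the reason a surrogate is used at all.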

Core claim

There exists a convex order inequality between the Kriging-based estimator of the failure probability and the proposed alternative estimator; the inequality can be used both to compare efficiency and to quantify uncertainty, and it yields a sequential design procedure for computer experiments.

What carries the argument

The convex order relation between the two estimators of the failure probability; it ranks them by the expected value of any convex function applied to the estimator.
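For reference, the textbook definition of the convex order can be written as follows. The symbols X and Y are generic random variables, not the paper's notation for its two estimators.

```latex
X \le_{\mathrm{cx}} Y
\quad\Longleftrightarrow\quad
\mathbb{E}[\varphi(X)] \le \mathbb{E}[\varphi(Y)]
\quad \text{for every convex } \varphi : \mathbb{R} \to \mathbb{R}
\text{ such that both expectations exist.}
```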

If this is right

  • Whichever estimator is smaller in convex order has lower or equal expected loss under every convex loss, and in particular lower or equal variance; since the two means coincide, it is the more efficient of the two.
  • Bounds on the variability of the estimated failure probability follow directly from the convex-order relation.
  • A sequential design algorithm can be constructed that selects the next simulation point to shrink the convex-order gap.
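The variance statement in the first bullet follows from a short, standard argument, sketched here with generic random variables X and Y satisfying E[φ(X)] ≤ E[φ(Y)] for all convex φ. The notation is illustrative and not taken from the paper.

```latex
\begin{align*}
\varphi(x) = x \ \text{and}\ \varphi(x) = -x
  &\;\Longrightarrow\; \mathbb{E}[X] = \mathbb{E}[Y] =: \mu, \\
\varphi(x) = (x - \mu)^2
  &\;\Longrightarrow\; \operatorname{Var}(X) \le \operatorname{Var}(Y), \\
\varphi(x) = (x - t)_+
  &\;\Longrightarrow\; \mathbb{E}[(X - t)_+] \le \mathbb{E}[(Y - t)_+]
  \quad \text{for all } t \in \mathbb{R}.
\end{align*}
```

Conversely, equal means together with the stop-loss inequality in the last line for every t are known to characterize the convex order, which gives a practical way to check it on samples.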

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same ordering technique may apply to other threshold functionals of Gaussian processes, such as expected excursion volume.
  • On low-dimensional analytic test cases the inequality can be verified by direct Monte Carlo replication of both estimators.

Load-bearing premise

The alternative estimator matches the Kriging estimator in bias and the convex-order comparison extends to the failure-probability functional.

What would settle it

A numerical check on an analytic test function that records whether one estimator consistently shows smaller variance than the other while their means remain equal.
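A check of this kind could be sketched as below. The construction is hypothetical and not taken from the paper: S is the fraction of a grid on which a correlated Gaussian posterior path exceeds the threshold, and R uses the same pointwise marginals but lets a single shared standard normal drive every grid point (a comonotonic version, which is known to dominate in convex order). The posterior mean, covariance, threshold and replication count are all made-up values.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(0)
normal_cdf = np.vectorize(lambda z: 0.5 * (1.0 + erf(z / sqrt(2.0))))

# Hypothetical posterior on equally weighted grid points (illustrative values only).
x = np.linspace(0.0, 1.0, 40)
mean = 0.5 * np.sin(6.0 * x)
d = x[:, None] - x[None, :]
cov = 0.25 * np.exp(-0.5 * (d / 0.2) ** 2) + 1e-6 * np.eye(len(x))
sd = np.sqrt(np.diag(cov))
threshold = 0.3
n_rep = 40000

# S: exceedance fraction of correlated Gaussian paths drawn from N(mean, cov).
chol = np.linalg.cholesky(cov)
paths = mean + rng.standard_normal((n_rep, len(x))) @ chol.T
s = (paths > threshold).mean(axis=1)

# R: same marginals at each point, one shared normal variable per replication.
u = rng.standard_normal((n_rep, 1))
r = (mean + sd * u > threshold).mean(axis=1)

# Both should have the same mean, available in closed form.
exact_mean = float(np.mean(normal_cdf((mean - threshold) / sd)))

def stop_loss(sample, t):
    """Empirical stop-loss transform E[(Z - t)_+]."""
    return float(np.maximum(sample - t, 0.0).mean())

# Convex order S <= R corresponds to non-negative gaps, up to Monte Carlo noise.
levels = np.linspace(0.0, 1.0, 21)
gaps = np.array([stop_loss(r, t) - stop_loss(s, t) for t in levels])

print("means:", s.mean(), r.mean(), "exact:", exact_mean)
print("variances:", s.var(), r.var())
print("smallest stop-loss gap:", gaps.min())
```

Comparing stop-loss transforms rather than only variances tests the full convex order, since variance ordering alone is a necessary but not sufficient condition.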

Figures

Figures reproduced from arXiv: 1907.01781 by Lucie Bernard (IDP), Philippe Leduc (ST-TOURS).

Figure 1. The dashed black line is the function m_n and the gray areas are 95% confidence intervals of the form [m_n(x) − 1.96 σ_n(x), m_n(x) + 1.96 σ_n(x)] for all x ∈ X (Equation (5)).
Figure 1. Illustration of SUR strategies based on criteria J4,n (left) and JRn (right). Top: first iteration. Bottom: last iteration. Function g (solid black line); threshold T (red dashed line); initial experiments (squares); new experiments (circles); mean function m_n of the Kriging model (dashed curve); 95% confidence intervals given in Equation (5) (shaded area); density of P_X (solid red line).
Figure 2. Performance of SUR strategies based on criteria J4,n (left) and JRn (right). True failure probability p = 4.643 × 10^−2 (red dashed line); successive estimates of p given by the estimator (9) (circles); successive estimates of the 95% credible interval given in Equation (20), with β = 1/2 (shaded area).
Figure 3. Sensitivity to the sample size of the Monte Carlo method. Both panels: true probability p = 4.643 × 10^−2 (red dashed line). Left: boxplots of the estimates of p obtained with 100 Monte Carlo simulations. Right: boxplots of the estimated 95% credible interval obtained with the same 100 Monte Carlo simulations.
Figure 4. Successive estimates of the failure probability p (horizontal red dashed line); 95% confidence intervals with N = 1000 simulations (red area); estimates of p (blue points) and 95% credible intervals (gray area) with n = 50, ..., 200 simulations.
Figure 5. Successive estimates of the failure probability p (horizontal red dashed line), with the derivative taken into account (left) and without it (right); 95% confidence intervals with N = 1000 simulations (red area); estimates of p (blue points) and 95% credible intervals (gray area) with n = 50, ..., 200 simulations.
Original abstract

This paper deals with the estimation of a failure probability of an industrial product. To be more specific, it is defined as the probability that the output of a physical model, with random input variables, exceeds a threshold. The model corresponds with an expensive to evaluate black-box function, so that classical Monte Carlo simulation methods cannot be applied. Bayesian principles of the Kriging method are then used to design an estimator of the failure probability. From a numerical point of view, the practical use of this estimator is restricted. An alternative estimator is proposed, which is equivalent in term of bias. The main result of this paper concerns the existence of a convex order inequality between these two estimators. This inequality allows to compare their efficiency and to quantify the uncertainty on the results that these estimators provide. A sequential procedure for the construction of a design of computer experiments, based on the principle of the Stepwise Uncertainty Reduction strategies, also results of the convex order inequality. The interest of this approach is highlighted through the study of a real case from the company STMicroelectronics.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The paper defines a Kriging-based estimator of failure probability for an expensive black-box simulator and proposes an alternative estimator shown to have identical bias. The central result is a convex-order inequality between the two random variables, proved using properties of the Gaussian-process posterior. This inequality is used to rank estimator efficiency, bound uncertainty, and derive a Stepwise Uncertainty Reduction (SUR) sequential design. The approach is illustrated on an industrial case from STMicroelectronics.

Significance. If the convex-order result holds, the paper supplies a rigorous, assumption-light tool for comparing the two estimators and for driving adaptive designs in rare-event estimation. The explicit bias calculation and the use of convex order on the failure-probability functional constitute a clear technical contribution; the manuscript also provides the construction and proof, which are positive features.

minor comments (3)
  1. The notation for the two estimators (Kriging-based and alternative) should be introduced with explicit formulas in §2 or §3 so that the bias-equivalence statement can be checked without back-referencing the abstract.
  2. Figure captions for the industrial example should state the dimension of the input space and the number of design points used, to allow direct comparison with other SUR strategies.
  3. A short remark on the computational cost of evaluating the convex-order bound itself would help readers assess practicality for larger designs.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, the recognition of the technical contribution of the convex-order result, and the recommendation of minor revision. We are pleased that the bias calculation, the use of convex order on the failure-probability functional, and the resulting SUR design are viewed favorably.

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The manuscript supplies explicit constructions of both the Kriging-based estimator and the alternative estimator, derives their bias equivalence from the Gaussian process posterior, and proves the convex-order inequality directly from properties of that posterior. These steps constitute an internally supported derivation chain with no reduction to fitted inputs, self-citations, or ansatzes that would render the central claim equivalent to its own premises by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; no explicit free parameters, axioms, or invented entities can be identified.

pith-pipeline@v0.9.0 · 5715 in / 1098 out tokens · 29784 ms · 2026-05-25T09:50:18.843439+00:00 · methodology


