arXiv: 2605.21416 · v1 · pith:YQD3FTC7new · submitted 2026-05-20 · 🧮 math.ST · stat.ME · stat.TH

Data driven extreme value distribution estimation: Derivation of the Mean Integrated Squared Error, optimal bandwidth selection and stability conditions

Pith reviewed 2026-05-21 02:44 UTC · model grok-4.3

classification 🧮 math.ST · stat.ME · stat.TH
keywords extreme value distribution · kernel density estimation · mean integrated squared error · bandwidth selection · stability conditions · data-driven estimation · nonparametric statistics

The pith

The DDEVD kernel estimator supplies an explicit mean integrated squared error formula that determines the optimal bandwidth and its stability for extreme value distribution fitting.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents the data-driven extreme value distribution estimator as a kernel method for approximating the distribution of extremes directly from samples. It works through a step-by-step derivation of the estimator's mean integrated squared error, treating the kernel bandwidth as the adjustable parameter. This expression is then minimized to locate the best bandwidth value. The same derivation produces conditions that keep the optimal bandwidth from changing abruptly when the data are slightly perturbed. A reader cares because reliable tail estimation matters for risk assessment in insurance, finance, and climate records, where arbitrary bandwidth choices have long been a practical obstacle.

Core claim

The DDEVD estimator is a kernel smoother for extreme value distributions. Its mean integrated squared error admits an explicit expression in terms of the bandwidth. Minimizing that expression supplies the optimal bandwidth. The same expression further yields analytic conditions that guarantee the bandwidth choice remains stable under small data perturbations.

What carries the argument

The mean integrated squared error of the kernel estimator: it is derived in closed form and then minimized with respect to the bandwidth, and the stability of the resulting optimum is checked.
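
For orientation, the textbook bias-variance template for ordinary kernel density estimation has the shape sketched below. It is an assumption that the DDEVD expression shares this structure; the symbols K (kernel), f (target density), n (sample size) and h (bandwidth) follow the classical setting, and the paper's own formula is not reproduced here.

```latex
\mathrm{AMISE}(h) = \frac{R(K)}{nh} + \frac{h^{4}}{4}\,\mu_2(K)^{2}\,R(f''),
\qquad
R(g) = \int g(x)^{2}\,dx,
\quad
\mu_2(K) = \int x^{2} K(x)\,dx,
\\[6pt]
\frac{d}{dh}\,\mathrm{AMISE}(h) = -\frac{R(K)}{nh^{2}} + h^{3}\,\mu_2(K)^{2}\,R(f'') = 0
\;\Longrightarrow\;
h^{*} = \left(\frac{R(K)}{\mu_2(K)^{2}\,R(f'')\,n}\right)^{1/5}.
```

The first term is the integrated variance and the second the integrated squared bias, so the optimum balances the two. The formula is only meaningful when R(f'') is finite, which is the regularity question the referee report raises.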

If this is right

  • The optimal bandwidth follows from a direct formula rather than iterative search.
  • The stability conditions bound how much the chosen bandwidth can jump when one or two data points are added or removed.
  • The estimator can be deployed on moderate-sized data sets without requiring heavy numerical tuning.
  • Practitioners obtain a single, theoretically justified smoothing parameter for tail modeling.
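
The first two points can be illustrated with a minimal sketch, under stated assumptions: the normal-reference rule h = 1.06 σ n^(-1/5) stands in for a closed-form bandwidth (it is not the paper's DDEVD formula), and stability is probed by removing one observation at a time. The function names and the simulated block maxima are hypothetical.

```python
import numpy as np

def normal_reference_bandwidth(x):
    """Closed-form stand-in selector: h = 1.06 * sigma * n^(-1/5) (Gaussian kernel)."""
    x = np.asarray(x, dtype=float)
    return 1.06 * x.std(ddof=1) * x.size ** (-0.2)

def max_leave_one_out_jump(x, selector):
    """Largest relative change in the selected bandwidth when one point is removed."""
    x = np.asarray(x, dtype=float)
    h_full = selector(x)
    jumps = [abs(selector(np.delete(x, i)) - h_full) / h_full for i in range(x.size)]
    return h_full, max(jumps)

rng = np.random.default_rng(0)
# Assumed toy data: maxima of 200 blocks, each of 50 standard normal draws.
block_maxima = rng.standard_normal((200, 50)).max(axis=1)

h, jump = max_leave_one_out_jump(block_maxima, normal_reference_bandwidth)
print(f"h = {h:.4f}, largest leave-one-out relative jump = {jump:.4%}")
```

A selector with a direct formula makes this kind of perturbation check cheap, because each re-evaluation is a single pass over the data rather than a new numerical search.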

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same MISE derivation might be adapted to other kernel-based tail estimators that currently rely on cross-validation.
  • Stability conditions could be used to construct confidence bands around the fitted extreme value curve.
  • The approach suggests a route to fully automatic bandwidth selection in nonparametric extreme value statistics.

Load-bearing premise

The mean integrated squared error of the kernel estimator applied to extreme value distributions takes a form that can be written out explicitly enough to allow direct minimization and stability analysis.

What would settle it

A Monte Carlo experiment in which the bandwidth that minimizes the derived MISE expression produces a larger true integrated squared error than a nearby bandwidth chosen by cross-validation on fresh extreme value samples.
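
Such an experiment could be set up along the following lines. This is a sketch under loud assumptions, not the paper's procedure: a plain Gaussian kernel density estimator stands in for DDEVD, the target is the standard Gumbel density, the classical AMISE-optimal bandwidth stands in for the derived formula, and least-squares cross-validation is the competitor. All function names are hypothetical.

```python
import numpy as np

def gumbel_pdf(x):
    """Density of the standard Gumbel (maximum) distribution."""
    z = np.exp(-x)
    return z * np.exp(-z)

def gaussian_kde(grid, data, h):
    """Plain Gaussian kernel density estimate evaluated on `grid`."""
    u = (grid[:, None] - data[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (data.size * h * np.sqrt(2 * np.pi))

def ise(data, h, grid):
    """Integrated squared error against the true density (Riemann sum)."""
    dx = grid[1] - grid[0]
    return np.sum((gaussian_kde(grid, data, h) - gumbel_pdf(grid)) ** 2) * dx

def lscv_score(data, h):
    """Least-squares cross-validation criterion for a Gaussian kernel."""
    n = data.size
    d = data[:, None] - data[None, :]
    # Integral of the squared estimate: pairwise N(0, 2h^2) densities.
    term1 = np.exp(-d**2 / (4 * h**2)).sum() / (n**2 * 2 * h * np.sqrt(np.pi))
    # Leave-one-out part: kernel values over off-diagonal pairs only.
    k = np.exp(-d**2 / (2 * h**2)) / (h * np.sqrt(2 * np.pi))
    term2 = 2 * (k.sum() - np.trace(k)) / (n * (n - 1))
    return term1 - term2

def amise_bandwidth(n, grid):
    """Classical AMISE-optimal bandwidth (Gaussian kernel), using the true density."""
    dx = grid[1] - grid[0]
    f2 = np.gradient(np.gradient(gumbel_pdf(grid), dx), dx)  # numerical f''
    r_f2 = np.sum(f2**2) * dx                                # R(f'')
    r_k = 1 / (2 * np.sqrt(np.pi))                           # R(K); mu_2(K) = 1
    return (r_k / (r_f2 * n)) ** 0.2

def run_experiment(n=200, reps=30, seed=0):
    rng = np.random.default_rng(seed)
    grid = np.linspace(-6, 14, 801)
    candidates = np.linspace(0.1, 1.2, 40)
    h_formula = amise_bandwidth(n, grid)
    ise_formula, ise_cv = [], []
    for _ in range(reps):
        train = rng.gumbel(size=n)  # sample used to choose the CV bandwidth
        fresh = rng.gumbel(size=n)  # fresh sample used to score both bandwidths
        h_cv = candidates[np.argmin([lscv_score(train, h) for h in candidates])]
        ise_formula.append(ise(fresh, h_formula, grid))
        ise_cv.append(ise(fresh, h_cv, grid))
    return h_formula, np.mean(ise_formula), np.mean(ise_cv)

h_formula, mise_formula, mise_cv = run_experiment()
print(f"h_formula={h_formula:.3f}  MISE(formula)={mise_formula:.5f}  MISE(CV)={mise_cv:.5f}")
```

If the averaged error for the formula bandwidth were consistently larger than the cross-validated one across many replications and sample sizes, that would count against the derived optimum; a single run with few replications is not enough to decide either way.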

Figures

Figures reproduced from arXiv: 2605.21416 by Michael Sandbichler, Tobias Hell.

Figure 1: Stability phase diagram for the DDEVD bandwidth optimisation for different dis…
Figure 2: Stability phase diagram for the DDEVD bandwidth optimisation for different distri…
Figure 3: MISE of the DDEVD estimator when the base distribution…
Figure 4: Stability of the DDEVD bandwidth optimisation under varying block sizes. With a…
Figure 5: MISE of the DDEVD estimator with the analytically derived optimal bandwidth
Original abstract

We introduce the data driven extreme value distribution (DDEVD) estimator, a kernel-based method for estimating extreme value distributions from data. We derive its mean integrated squared error (MISE) in detail, use it to compute the optimal bandwidth and establish stability conditions for the bandwidth optimization procedure.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript introduces the data-driven extreme value distribution (DDEVD) estimator, a kernel density estimator tailored to extreme value distributions. It derives the mean integrated squared error (MISE) expression in detail, employs the MISE to obtain an optimal bandwidth, and derives stability conditions for the resulting bandwidth optimization procedure.

Significance. If the MISE derivation and stability analysis hold under the regularity conditions appropriate to extreme-value distributions, the work would supply a theoretically grounded, data-driven bandwidth selector for nonparametric tail estimation. This addresses a practical gap in extreme-value applications where ad-hoc bandwidth choices are common and could improve finite-sample performance in risk modeling and tail inference.

major comments (3)
  1. [Section 3 (MISE derivation)] The MISE derivation (Section 3) proceeds under the classical KDE assumptions of twice continuous differentiability of the target density and integrability of (f'')². For the Fréchet and GPD families with shape parameter ξ ≥ 1/2 these conditions fail at infinity or at the lower endpoint; the bias integral therefore diverges and the claimed closed-form MISE is not obtained. This step is load-bearing for both the optimal-bandwidth formula and the subsequent stability analysis.
  2. [Eq. (12)] The optimal bandwidth expression (Eq. (12)) is obtained by minimizing the derived MISE. Because the MISE itself is not finite under the heavy-tail regimes typical of extreme-value data, the resulting bandwidth formula is not guaranteed to be well-defined or asymptotically optimal for the distributions the estimator is intended to target.
  3. [Section 5 (stability conditions)] The stability conditions (Section 5) are stated for the bandwidth optimization map. Without a verified finite MISE, the contraction-mapping or Lipschitz arguments used to establish stability rest on an unverified premise and require re-derivation under the weaker regularity conditions that actually hold for extreme-value distributions.
minor comments (2)
  1. [Section 2] Notation for the kernel and the extreme-value index is introduced without a consolidated table; a short notation summary would improve readability.
  2. [Table 2] The simulation study reports only point estimates of MISE; standard errors or bootstrap variability measures should be added to Table 2 to allow assessment of the stability claims.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the careful and constructive review of our manuscript. The comments highlight important considerations regarding the applicability of classical kernel density estimation assumptions to extreme value distributions. We respond to each major comment below and indicate the revisions planned for the next version of the manuscript.

Point-by-point responses
  1. Referee: [Section 3 (MISE derivation)] The MISE derivation (Section 3) proceeds under the classical KDE assumptions of twice continuous differentiability of the target density and integrability of (f'')². For the Fréchet and GPD families with shape parameter ξ ≥ 1/2 these conditions fail at infinity or at the lower endpoint; the bias integral therefore diverges and the claimed closed-form MISE is not obtained. This step is load-bearing for both the optimal-bandwidth formula and the subsequent stability analysis.

    Authors: We acknowledge that the twice continuous differentiability and integrability of (f'')² do not hold for all members of the Fréchet and GPD families, particularly when ξ ≥ 1/2. The manuscript will be revised to state explicitly the regularity conditions under which the MISE derivation is valid (namely ξ < 1/2) and to note the divergence of the bias term outside this range. This clarification will be added to Section 3 without altering the algebraic steps already presented.
    revision: yes

  2. Referee: [Eq. (12)] The optimal bandwidth expression (Eq. (12)) is obtained by minimizing the derived MISE. Because the MISE itself is not finite under the heavy-tail regimes typical of extreme-value data, the resulting bandwidth formula is not guaranteed to be well-defined or asymptotically optimal for the distributions the estimator is intended to target.

    Authors: We agree that the closed-form optimal bandwidth in Eq. (12) presupposes a finite MISE. In the revision we will restrict the claim of asymptotic optimality to the regime ξ < 1/2 where the MISE expression is valid, and we will add a brief discussion of the formula’s behavior and possible practical use as an approximation when ξ ≥ 1/2. No change to the algebraic derivation of Eq. (12) itself is required.
    revision: yes

  3. Referee: [Section 5 (stability conditions)] The stability conditions (Section 5) are stated for the bandwidth optimization map. Without a verified finite MISE, the contraction-mapping or Lipschitz arguments used to establish stability rest on an unverified premise and require re-derivation under the weaker regularity conditions that actually hold for extreme-value distributions.

    Authors: The stability arguments in Section 5 rely on the MISE being finite. We will revise this section to make the dependence on the finite-MISE regime explicit and to limit the contraction-mapping claim accordingly. A full re-derivation under weaker tail conditions is beyond the scope of the present work; we therefore mark this point as a partial revision and will flag the restriction in the text.
    revision: partial

Circularity Check

0 steps flagged

MISE derivation and bandwidth optimization follow standard nonparametric procedure without reduction to inputs

The paper derives the mean integrated squared error for its kernel DDEVD estimator in detail, then minimizes that expression to obtain an optimal bandwidth and checks stability conditions. This is the conventional analytic route in kernel density estimation and does not reduce by construction to a fitted parameter renamed as a prediction, a self-definitional loop, or a load-bearing self-citation. No equations or sections in the abstract or description exhibit the target result being presupposed in the derivation; the steps remain independent of the final bandwidth value. The analysis is therefore self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no explicit free parameters, axioms, or invented entities; the kernel bandwidth is optimized rather than treated as a free parameter, and standard kernel and extreme-value assumptions are invoked implicitly.



Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

  • Cost.FunctionalEquation washburn_uniqueness_aczel · unclear

    Relation between the paper passage and the cited Recognition theorem.

    Theorem 2.1 (Asymptotic Expansion of the Mean Squared Error) ... $\mathrm{MSE}(y) = \frac{1}{m^2}\bigl[\sum_i b_{0,i}\bigr]^2 + \dots + O(h^3)$ with explicit bias coefficients $b_{0,i}(y) \approx F_X^{n_i}(y)\bigl(\sqrt{F_X}\, e^{n_i/2\,(1-F_X)} - 1\bigr)$ and variance coefficients obtained from central-moment approximations

  • Foundation.DimensionForcing alexander_duality_circle_linking · unclear

    Relation between the paper passage and the cited Recognition theorem.

    Theorem 2.3 ... $m < C(\gamma, F_X, K)\, n^{1+\gamma/2}$ for $\gamma > -1/2$, derived via Watson's lemma and change-of-variables $z = n(1 - F_X(y))$ on heavy-tail asymptotics $f_X(z) \sim D z^{\gamma+1}$

What do these tags mean?

  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
