Choosing the threshold in extreme value analysis
Pith reviewed 2026-06-30 01:07 UTC · model grok-4.3
The pith
Simulations identify the strongest performers among more than 40 threshold selection methods for extreme value analysis.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper reviews the statistical foundations of more than 40 threshold selection procedures, including those based on Hill's estimator, visual diagnostics, goodness-of-fit tests, and extended generalized Pareto models. It then identifies the most promising methods through an extensive simulation study and compares the approaches on daily rainfall data from Padova.
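For orientation, the Hill-based selectors in that list start from the same quantity: the Hill estimate of the tail index computed from the k largest observations, with the choice of k playing the role of the threshold. A minimal sketch follows, assuming positive, heavy-tailed data. The function names and the simulated Pareto sample are illustrative assumptions and are not taken from the paper.

```python
import math
import random


def hill_estimator(sample, k):
    """Hill estimate of the tail index xi > 0 from the k largest observations."""
    if not 1 <= k < len(sample):
        raise ValueError("k must satisfy 1 <= k < len(sample)")
    ordered = sorted(sample, reverse=True)
    if ordered[k] <= 0:
        raise ValueError("the (k+1)-th largest observation must be positive")
    # The (k+1)-th largest observation acts as the (random) threshold.
    log_threshold = math.log(ordered[k])
    # Average log-excess of the k largest observations over that threshold.
    return sum(math.log(x) - log_threshold for x in ordered[:k]) / k


def simulate_pareto(n, xi, seed=0):
    """Hypothetical test data: exact Pareto sample with P(X > x) = x**(-1/xi)."""
    rng = random.Random(seed)
    # Inverse transform; 1 - rng.random() lies in (0, 1], so no division by zero.
    return [(1.0 - rng.random()) ** (-xi) for _ in range(n)]


if __name__ == "__main__":
    data = simulate_pareto(5000, xi=0.5, seed=1)
    for k in (50, 200, 500, 2000):
        print(k, round(hill_estimator(data, k), 3))
```

For an exact Pareto sample the estimate is roughly flat in k. For real data it typically drifts as k grows, and that drift is what the selection procedures try to trade off against variance.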
What carries the argument
Threshold selection procedures that determine the cutoff value above which exceedances are modeled by the generalized Pareto distribution.
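One way to picture what such a procedure has to decide is a threshold-stability check: fit the generalized Pareto distribution to the excesses over several candidate thresholds, then see where the shape and the modified scale (scale minus shape times threshold) settle down. The sketch below is only an illustration under stated assumptions. It uses method-of-moments estimates instead of maximum likelihood to stay dependency-free, which requires a shape below 1/2, and the function names and simulated data are hypothetical.

```python
import random
import statistics


def gpd_moment_fit(excesses):
    """Method-of-moments estimates (xi, sigma) for GPD excesses; assumes xi < 1/2."""
    if len(excesses) < 2:
        raise ValueError("need at least two excesses")
    m = statistics.fmean(excesses)
    s2 = statistics.variance(excesses)
    if s2 == 0:
        raise ValueError("excesses have zero variance")
    # For the GPD, mean = sigma / (1 - xi) and mean**2 / variance = 1 - 2 * xi.
    ratio = m * m / s2
    xi = 0.5 * (1.0 - ratio)
    sigma = 0.5 * m * (1.0 + ratio)
    return xi, sigma


def stability_table(sample, thresholds):
    """Rows of (threshold, n_exceedances, shape, modified_scale)."""
    rows = []
    for u in thresholds:
        excesses = [x - u for x in sample if x > u]
        if len(excesses) < 2:
            continue  # too few exceedances to fit anything
        xi, sigma_u = gpd_moment_fit(excesses)
        # If the GPD holds above u, sigma_u - xi * u does not depend on u.
        rows.append((u, len(excesses), xi, sigma_u - xi * u))
    return rows


def simulate_gpd(n, xi, sigma, seed=0):
    """Hypothetical test data: GPD sample by inverse transform (xi != 0)."""
    rng = random.Random(seed)
    return [sigma / xi * ((1.0 - rng.random()) ** (-xi) - 1.0) for _ in range(n)]


if __name__ == "__main__":
    data = simulate_gpd(5000, xi=0.1, sigma=1.0, seed=1)
    for row in stability_table(data, [0.0, 0.5, 1.0, 1.5]):
        print(row)
```

On data that are exactly generalized Pareto, both columns stay near their true values at every threshold. On real data the lowest threshold at which they stabilise is the usual visual candidate.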
If this is right
- Certain procedures can be automated for routine use without requiring manual diagnostics.
- Inference for extremes becomes more stable when the threshold is chosen by one of the top-ranked methods.
- The uncertainty attached to the threshold can be incorporated into subsequent risk calculations more readily.
Where Pith is reading between the lines
- The same simulation framework could be applied to test whether the top methods remain reliable under stronger serial dependence or non-stationary conditions.
- Extension to multivariate extremes would require new criteria because joint threshold choice interacts with dependence modeling.
- Bayesian formulations could treat the threshold as a random variable and compare posterior performance against the frequentist selectors identified here.
Load-bearing premise
The chosen simulation scenarios and the single Padova rainfall series adequately represent the range of real-world data and dependence structures that arise in applications.
What would settle it
A fresh simulation study that employs substantially different data-generating processes and produces a markedly different ranking of the procedures would falsify the identification of the most promising methods.
Original abstract
One of the two dominant approaches for univariate extreme value analysis is to model exceedances above a large threshold, the choice of which has a large impact on inference and whose uncertainty is often subsequently ignored. In this article we review more than 40 threshold selection procedures, including semiparametric methods based on Hill's estimator, visual diagnostics, goodness-of-fit tests, and others based on extended generalized Pareto models. Starting with the statistical properties underlying the various proposals, we provide a critical assessment of their strengths and weaknesses, discuss how they might be automated and describe the results of an extensive simulation study used to identify the most promising procedures. The approaches are compared using a long time series of daily rainfall totals from Padova.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper reviews more than 40 threshold selection procedures for univariate extreme value analysis (including Hill-based, visual, GoF, and extended GPD methods), critically assesses their statistical properties and automation potential, and reports results from an extensive simulation study to identify the most promising procedures, with comparison also performed on a long daily rainfall series from Padova.
Significance. If the simulation-based ranking holds under representative conditions, the work would offer practitioners concrete guidance on a high-impact modeling choice whose uncertainty is frequently ignored, potentially improving reliability of tail inferences across applications.
major comments (2)
- [Simulation study] Simulation study section: the data-generating processes are not described as including temporal clustering, non-stationarity, or heavy-tail regimes with varying dependence; because the Padova rainfall validation series exhibits these features, the reported ranking of the >40 procedures may not generalize and the central comparative claim rests on an unverified assumption of representativeness.
- [Application to rainfall data] Validation on Padova series: the single real-data example is used to illustrate but no quantitative comparison of method performance (e.g., stability of parameter estimates or out-of-sample predictive scores) is reported against the simulation winners, weakening the link between simulation rankings and practical utility.
minor comments (2)
- The abstract states 'more than 40' methods but the main text should include an explicit table or appendix listing all reviewed procedures with their key references for reproducibility.
- Notation for threshold estimators (e.g., u_n, k) should be standardized across sections to avoid reader confusion when comparing semiparametric and model-based approaches.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive report. We address the two major comments below, agreeing where the critique is valid and outlining targeted revisions. Our responses focus on clarifying the scope of the work while strengthening the manuscript's transparency and practical relevance.
Point-by-point responses
- Referee: [Simulation study] Simulation study section: the data-generating processes are not described as including temporal clustering, non-stationarity, or heavy-tail regimes with varying dependence; because the Padova rainfall validation series exhibits these features, the reported ranking of the >40 procedures may not generalize and the central comparative claim rests on an unverified assumption of representativeness.
  Authors: We agree that the simulation study is conducted under the classical i.i.d. setting with standard GPD tails and does not incorporate temporal clustering, non-stationarity, or dependence structures. This choice was deliberate to isolate the performance of threshold selectors under the asymptotic conditions where most of the reviewed methods are theoretically justified. The Padova series is presented as a separate real-data illustration rather than a direct validation of the simulation rankings. In revision we will (i) explicitly document the simulation assumptions in a dedicated limitations subsection, (ii) add a brief discussion of how dependence and non-stationarity may affect the relative performance of the procedures, and (iii) caution readers that the reported ranking applies most directly to the i.i.d. case. We do not plan to rerun the full simulation suite with dependent or non-stationary DGPs, as that would constitute a substantially different study.
  revision: partial
- Referee: [Application to rainfall data] Validation on Padova series: the single real-data example is used to illustrate but no quantitative comparison of method performance (e.g., stability of parameter estimates or out-of-sample predictive scores) is reported against the simulation winners, weakening the link between simulation rankings and practical utility.
  Authors: The Padova analysis is intended as an illustrative case study rather than a formal validation exercise, precisely because the true threshold is unknown. We therefore did not compute stability or predictive metrics that would require additional modeling assumptions. We accept that this leaves a gap between the simulation results and practical guidance. In the revised manuscript we will add a short quantitative subsection that reports, for the top-ranked procedures, (a) the variability of the resulting GPD parameter estimates across bootstrap resamples of the series and (b) a simple out-of-sample check using a hold-out period to compare predictive performance of the fitted tails. These additions will be presented as supplementary evidence rather than definitive proof of superiority.
  revision: yes
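The bootstrap check described in item (a) of the second response could look roughly like the sketch below. It is not the authors' procedure. It resamples observations independently, which ignores the serial dependence a daily rainfall series would have (a block bootstrap would be the natural alternative), and it uses a method-of-moments GPD fit in place of maximum likelihood. All function names and the synthetic series are hypothetical.

```python
import random
import statistics


def gpd_moment_shape(excesses):
    """Method-of-moments estimate of the GPD shape xi; assumes xi < 1/2."""
    m = statistics.fmean(excesses)
    s2 = statistics.variance(excesses)
    # For the GPD, mean**2 / variance = 1 - 2 * xi.
    return 0.5 * (1.0 - m * m / s2)


def bootstrap_shape_sd(series, threshold, n_boot=200, min_exceedances=10, seed=0):
    """Bootstrap standard deviation of the shape estimate at a fixed threshold."""
    rng = random.Random(seed)
    shapes = []
    for _ in range(n_boot):
        # i.i.d. resampling with replacement (assumption: no serial dependence).
        resample = rng.choices(series, k=len(series))
        excesses = [x - threshold for x in resample if x > threshold]
        if len(excesses) < min_exceedances:
            continue  # skip resamples with too few exceedances to fit
        shapes.append(gpd_moment_shape(excesses))
    if len(shapes) < 2:
        raise ValueError("too few usable bootstrap resamples")
    return statistics.stdev(shapes)


if __name__ == "__main__":
    # Synthetic stand-in for a data series: Pareto quantiles with xi = 0.1.
    series = [((i + 0.5) / 2000) ** -0.1 for i in range(2000)]
    for u in (1.1, 1.2, 1.3):
        print(u, round(bootstrap_shape_sd(series, u, n_boot=100, seed=1), 4))
```

Running this for the threshold chosen by each selector gives a comparable spread per method, though with the threshold held fixed it does not capture the uncertainty of the selection step itself.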
Circularity Check
No circularity: review and simulation study with independent empirical basis
The paper reviews over 40 existing threshold selection methods for extreme value analysis, provides a critical assessment based on their statistical properties, and ranks them via an extensive simulation study plus an application to one rainfall series. No mathematical derivation chain exists that reduces predictions or results to inputs by construction, self-definition, or load-bearing self-citation. The simulation outcomes and comparisons are generated independently of the reviewed methods and do not involve fitting parameters and then relabeling them as predictions. This is a standard empirical review paper whose claims rest on external simulation design rather than internal circular reduction.