On the optimal prediction of extreme events
Pith reviewed 2026-06-26 00:56 UTC · model grok-4.3
The pith
The asymptotically optimal positive homogeneous predictor of extreme Y given X is the non-extreme conditional quantile of a tilted distribution derived from the angular measure.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Under multivariate regular variation of (Y,X), the asymptotic prediction precision of any positive homogeneous predictor h(X) equals the tail dependence coefficient λ(Y,h(X)), expressed as an integral over the angular measure of the pair. The h that maximizes this coefficient is the non-extreme conditional quantile of the distribution obtained by tilting the angular measure; the associated peaks-over-threshold estimators are universally consistent over large classes of angular measures.
What carries the argument
The angular measure of the multivariate regular variation of (Y,X), which encodes all tail dependence; the optimal predictor is recovered as the conditional quantile of the distribution tilted by this measure.
If this is right
- Any positive homogeneous predictor's performance is exactly the tail dependence coefficient expressed as an integral functional of the angular measure.
- The variational problem for the optimal predictor admits an explicit solution via the tilted conditional quantile for arbitrary angular measures.
- Peaks-over-threshold estimators constructed from the tilted quantile are consistent without further parametric restrictions on the angular measure.
- The resulting procedure matches or exceeds oracle performance on the solar flare prediction task.
Where Pith is reading between the lines
- The same tilting construction could be used to derive optimal predictors when several response variables are extreme simultaneously.
- Replacing the homogeneous restriction with a more flexible class of functions might improve finite-sample performance while retaining the regular-variation tail analysis.
- The method supplies a natural benchmark against which existing extreme regression techniques can be compared on data sets with verifiable angular measures.
Load-bearing premise
The pair (Y,X) must be multivariate regularly varying so that its tail dependence structure is fully captured by a finite angular measure on the unit sphere.
What would settle it
A simulation with a known angular measure in which the tail dependence coefficient achieved by the estimated predictor is strictly less than the theoretical maximum computed directly from the angular measure.
Figures
read the original abstract
The prediction of the extremely large values of a response variable $Y$ in terms of a vector of covariates $X=(X_i)_{i=1}^d$ is a fundamental problem arising in many scientific and engineering domains. The scarcity of data in the extremes makes the optimal solution of this problem of particular importance. The optimal predictors of such events can be explicitly characterized in just a few cases and it is of fundamental practical and theoretical interest to develop optimal estimators over large classes of models and predictors. In this work, the focus is on the case where $(Y,X)$ have a multivariate regularly varying distribution and one seeks an optimal predictor expressed as a positive homogeneous function $h(X)$ of the covariates. The asymptotic prediction precision in this setting coincides with the tail-dependence coefficient $\lambda(Y,h(X))$ and it can be expressed as an integral functional of the associated angular measure of $(Y,X)$. Thus, finding asymptotically optimal homogeneous predictors amounts to solving a variational problem. We obtain a general solution to this problem, which is expressed in terms of a non-extreme conditional quantile of a tilted distribution derived from the angular measure. This leads to a general inference methodology for the optimal predictors in the peaks-over-threshold framework form extreme value theory. We establish the universal consistency for these estimators over large classes of angular measures. A general-purpose implementation of the resulting inference procedure is shown to work remarkably well against optimal oracle estimators, as well as in the challenging problem of extreme solar flare prediction.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper considers prediction of extreme values of a response Y given covariates X under the multivariate regular variation assumption on the joint distribution. It shows that the asymptotically optimal positive homogeneous predictor h(X) is obtained by solving a variational problem whose objective is the tail-dependence coefficient expressed as an integral functional of the angular measure; the solution is characterized as a non-extreme conditional quantile of a tilted distribution derived from that measure. Peaks-over-threshold estimators of this predictor are proved to be universally consistent over large classes of angular measures. A general-purpose implementation is presented and shown to perform well relative to oracle estimators, with an application to extreme solar-flare prediction.
Significance. If the central claims hold, the work supplies an explicit, theoretically justified construction for optimal homogeneous extreme predictors together with consistent estimators that apply to broad classes of tail-dependence structures. The reduction of the variational problem to a conditional quantile of a tilted measure, the universal-consistency result, and the reproducible numerical comparison against oracles are concrete strengths that advance both the theory and practice of extreme-value prediction.
minor comments (2)
- The abstract states that the estimators are 'universally consistent over large classes of angular measures,' but the precise definition of these classes (e.g., continuity or support conditions on the angular measure) is not visible in the provided summary; a short clarifying sentence in the introduction would help readers assess the scope.
- The numerical section reports that the procedure 'works remarkably well against optimal oracle estimators,' yet no quantitative table or figure reference is given in the abstract; adding a brief statement of the reported error metrics would strengthen the claim.
Simulated Author's Rebuttal
We thank the referee for the positive assessment of our work on optimal homogeneous extreme predictors and for recommending minor revision. No specific major comments appear in the report, so there are no points requiring point-by-point rebuttal. We will incorporate any minor editorial or presentational suggestions in the revised manuscript.
Circularity Check
No significant circularity
full rationale
The paper's derivation begins from the standard multivariate regular variation assumption on (Y,X), expresses the asymptotic prediction error as the tail dependence coefficient λ(Y,h(X)) which is an integral functional of the angular measure, poses the search for optimal positive homogeneous h as a well-posed variational problem, and solves it by exhibiting an explicit non-extreme conditional quantile of a tilted version of that measure. The subsequent POT estimator consistency is established directly over large classes of angular measures without any reduction of the claimed optimum to a fitted parameter, without self-referential definitions, and without load-bearing self-citations that would make the central result equivalent to its inputs by construction. The argument is therefore self-contained.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption The random vector (Y,X) is multivariate regularly varying.
Reference graph
Works this paper leans on
-
[1]
and Sabourin, A
Aghbalou, A., Portier, F. and Sabourin, A. [2024], Sharp error bounds for imbalanced classifi- cation: How many examples in the minority class?,in‘Proceedings of The 27th International Conference on Artificial Intelligence and Statistics’, Vol. 238 ofProceedings of Machine Learn- ing Research, PMLR, pp. 838–846. URL:https://proceedings.mlr.press/v238/aghb...
2024
-
[2]
Basrak, B., Davis, R. A. and Mikosch, T. [2002], ‘A characterization of multivariate regular variation’,Ann. Appl. Probab.12(3), 908–920. URL:http://dx.doi.org/10.1214/aoap/1031863174
-
[3]
Basrak, B., Milinˇ cevi´ c, N. and Molchanov, I. [2025], ‘Foundations of regular variation on topological spaces’. URL:https://arxiv.org/abs/2503.00921
Pith/arXiv arXiv 2025
-
[4]
Beirlant, J., Goegebeur, Y., Teugels, J. and Segers, J. [2004],Statistics of extremes, Wiley Series in Probability and Statistics, John Wiley & Sons, Ltd., Chichester. Theory and applications, With contributions from Daniel De Waal and Chris Ferro. URL:http://dx.doi.org/10.1002/0470012382
-
[5]
H., Goldie, C
Bingham, N. H., Goldie, C. M. and Teugels, J. L. [1987],Regular Variation, Encyclopedia of Mathematics and its Applications, Cambridge University Press
1987
-
[6]
Bobbia, B., Dombry, C. and Varron, D. [2025], ‘A donsker and glivenko-cantelli theorem for random measures linked to extreme value theory’,Scandinavian Journal of Statistics 52(4), 1708–1734. URL:https://onlinelibrary.wiley.com/doi/abs/10.1111/sjos.70007
-
[7]
Bobra, M. G. and Couvidat, S. [2015], ‘Solar flare prediction using SDO/HMI vector magnetic field data with a machine-learning algorithm’,The Astrophysical Journal798(2), 135. URL:https://dx.doi.org/10.1088/0004-637X/798/2/135
-
[8]
Bobra, M. G., Sun, X., Hoeksema, J. T., Turmon, M., Liu, Y., Hayashi, K., Barnes, G. and Leka, K. D. [2014], ‘The Helioseismic and Magnetic Imager (HMI) vector magnetic field pipeline: SHARPs –space-weather HMI active region patches’,Solar Physics289(9), 3549– 3578. URL:https://doi.org/10.1007/s11207-014-0529-3
-
[9]
G., Wright, P
Bobra, M. G., Wright, P. J., Turmon, M. J., Vertanen, J., Dissauer, K., Schrijver, C. J., Cheung, M. C. M., Wheatland, M. S., Leka, K. D. and Barnes, G. [2021], ‘SMARTs and SHARPs: Two solar cycles of active region data’,The Astrophysical Journal Supplement Series256(2), 26
2021
-
[10]
and Davison, A
Boldi, M.-O. and Davison, A. C. [2007], ‘A mixture model for multivariate extremes’,Journal of the Royal Statistical Society: Series B (Statistical Methodology)69(2), 217–229
2007
-
[11]
and Sabourin, A
Cl´ emen¸ con, S., Huet, N. and Sabourin, A. [2025], ‘On regression in extreme regions’,Electronic Journal of Statistics19(2), 4784–4828
2025
-
[12]
and Segers, J
Cl´ emen¸ con, S., Jalalzai, H., Lhaut, S., Sabourin, A. and Segers, J. [2023], ‘Concentration bounds for the empirical angular measure with statistical learning applications’,Bernoulli 29(4), 2797–2827
2023
-
[13]
and Sabourin, A
Cl´ emen¸ con, S. and Sabourin, A. [2026], ‘Weak signals and heavy tails: Learning theory meets extreme value analysis’,Extremes
2026
-
[14]
Connor, R. J. and Mosimann, J. E. [1969], ‘Concepts of independence for proportions with a /Optimal prediction of extreme events35 generalization of the Dirichlet distribution’,Journal of the American Statistical Association 64(325), 194–206
1969
-
[15]
and Strokorb, K
Corradini, M. and Strokorb, K. [2024], ‘Stochastic ordering in multivariate extremes’,Extremes 27, 357–396
2024
-
[16]
Davis, R. A. and Mikosch, T. [2008], ‘Extreme value theory for space-time processes with heavy-tailed distributions’,Stochastic Process. Appl.118(4), 560–584. URL:http://dx.doi.org/10.1016/j.spa.2007.06.001
-
[17]
and Lugosi, G
Devroye, L., Gy¨ orfi, L. and Lugosi, G. [1996],A Probabilistic Theory of Pattern Recognition, Vol. 31 ofStochastic Modelling and Applied Probability, Springer, New York
1996
-
[18]
Dombry, C. and Ribatet, M. [2015], ‘Functional regular variations, Pareto processes and peaks over threshold’,Stat. Interface8(1), 9–17. URL:https://doi.org/10.4310/SII.2015.v8.n1.a2
-
[19]
Dudley, R. M. [1989],Real Analysis and Probability, Wadsworth and Brook/Cole
1989
-
[20]
and Mikosch, T
Dyszewski, P. and Mikosch, T. [2020], ‘Homogeneous mappings of regularly varying vectors’, Ann. Appl. Probab.30, 2999–3026. URL:https: // doi. org/ 10. 1214/ 20-AAP1579
2020
-
[21]
and Maume-Deschamps, V
Elie-Dit-Cosaque, K. and Maume-Deschamps, V. [2022], ‘Random forest estimation of con- ditional distribution functions and conditional quantiles’,Electronic Journal of Statistics 16(2), 6553–6583
2022
-
[22]
and Gijbels, I
Fan, J. and Gijbels, I. [1996],Local Polynomial Modelling and Its Applications, Vol. 66 of Monographs on Statistics and Applied Probability, Chapman & Hall, London
1996
-
[23]
Foug` eres, A.-L., Nolan, J. P. and Rootz´ en, H. [2009], ‘Models for Dependent Ex- tremes Using Stable Mixtures’,Scandinavian Journal of Statistics36(1), 42–59. eprint: https://onlinelibrary.wiley.com/doi/pdf/10.1111/j.1467-9469.2008.00613.x. URL:https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1467-9469.2008.00613.x
-
[24]
Hult, H. and Lindskog, F. [2006], ‘Regular variation for measures on metric spaces’,Publ. Inst. Math. (Beograd) (N.S.)80(94), 121–140. URL:http://dx.doi.org/10.2298/PIM0694121H
-
[25]
Irvin, J., Rajpurkar, P., Ko, M., Yu, Y., Ciurea-Ilcus, S., Chute, C., Marklund, H., Haghgoo, B., Ball, R., Shpanskaya, K., Lungren, M. P. and Ng, A. Y. [2019], CheXpert: A large chest radiograph dataset with uncertainty labels and expert comparison,in‘Proceedings of the AAAI Conference on Artificial Intelligence’, Vol. 33, pp. 590–597
2019
-
[26]
and Sabourin, A
JALALZAI, H., Cl´ emen¸ con, S. and Sabourin, A. [2018], On binary classification in extreme regions,inS. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi and R. Gar- nett, eds, ‘Advances in Neural Information Processing Systems’, Vol. 31, Curran Associates, Inc. URL:https: // proceedings. neurips. cc/ paper_ files/ paper/ 2018/ file/ 0ebcc7...
2018
-
[27]
Janßen, A., Neblung, S. and Stoev, S. [2023], ‘Tail-dependence, exceedance sets, and metric embeddings’,Extremes. URL:https://doi.org/10.1007/s10687-023-00471-z
-
[28]
[2015],Dependence Modeling with Copulas, Chapman & Hall/CRC Monographs on Statistics & Applied Probability, CRC Press, Boca Raton, FL
Joe, H. [2015],Dependence Modeling with Copulas, Chapman & Hall/CRC Monographs on Statistics & Applied Probability, CRC Press, Boca Raton, FL
2015
-
[29]
Kulik, R. and Soulier, P. [2020],Heavy-Tailed Time Series, Springer Series in Operations Research and Financial Engineering, Springer, New York, NY. URL:http://link.springer.com/10.1007/978-1-0716-0737-4
-
[30]
Le Cam, L. and Yang, G. L. [2000],Asymptotics in Statistics, Springer Series in Statistics, second edn, Springer-Verlag, New York. Some basic concepts. /Optimal prediction of extreme events36 URL:https://doi.org/10.1007/978-1-4612-1166-2
-
[31]
Lindskog, F., Resnick, S. I. and Roy, J. [2014], ‘Regularly varying measures on metric spaces: hidden regular variation and hidden jumps’,Probab. Surv.11, 270–314. URL:http://dx.doi.org/10.1214/14-PS231
-
[32]
McNeil, A. J. and Neˇ slehov´ a, J. [2009], ‘Multivariate Archimedean copulas,d-monotone func- tions andl 1-norm symmetric distributions’,Ann. Statist.37(5B), 3059–3097. URL:https://doi.org/10.1214/07-AOS556
-
[33]
Meinguet, T. and Segers, J. [2012], Regularly varying time series in Banach spaces. Preprint available from arXiv:1001.3262. URL:http://arxiv.org/pdf/1001.3262v1
Pith/arXiv arXiv 2012
-
[34]
[2006], ‘Quantile regression forests’,Journal of Machine Learning Research 7, 983–999
Meinshausen, N. [2006], ‘Quantile regression forests’,Journal of Machine Learning Research 7, 983–999. URL:https://www.jmlr.org/papers/v7/meinshausen06a.html
2006
-
[35]
Nguyen, X., Wainwright, M. J. and Jordan, M. I. [2010], ‘Estimating divergence functionals and the likelihood ratio by convex risk minimization’,IEEE Transactions on Information Theory56(11), 5847–5861
2010
-
[36]
swpc.noaa.gov/products/goes-x-ray-flux
NOAA Space Weather Prediction Center [2023], ‘GOES X-ray Flux Data’,https://www. swpc.noaa.gov/products/goes-x-ray-flux. Accessed: 2025-06-13
2023
-
[37]
[2002],A user’s guide to measure theoretic probability, Vol
Pollard, D. [2002],A user’s guide to measure theoretic probability, Vol. 8 ofCambridge Series in Statistical and Probabilistic Mathematics, Cambridge University Press, Cambridge
2002
-
[38]
Portnoy, S. and Koenker, R. [1997], ‘The Gaussian hare and the Laplacian tortoise: com- putability of squared-error versus absolute-error estimators’,Statistical Science12(4), 279 – 300. URL:https://doi.org/10.1214/ss/1030037960
-
[39]
[2007],Heavy-Tail Phenomena: Probabilistic and Statistical Modeling, number v
Resnick, S. [2007],Heavy-Tail Phenomena: Probabilistic and Statistical Modeling, number v. 10in‘Heavy-tail phenomena: probabilistic and statistical modeling’, Springer. URL:https://books.google.com/books?id=p8uq2QFw9PUC
2007
-
[40]
Resnick, S. I. [1987],Extreme Values, Regular Variation and Point Processes, Springer-Verlag, New York
1987
-
[41]
Resnick, S. I. [1999],A probability path, Birkh¨ auser Boston Inc., Boston, MA
1999
-
[42]
Resnick, S. I. [2024],The art of finding hidden risks, Springer, New York. Hidden Regular Variation in the 21st Century
2024
-
[43]
and Tong, X
Rigollet, P. and Tong, X. [2011], ‘Neyman–Pearson classification, convexity and stochastic constraints’,Journal of Machine Learning Research12, 2831–2855. URL:https://jmlr.org/papers/v12/rigollet11a.html
2011
-
[44]
and Naveau, P
Sabourin, A. and Naveau, P. [2014], ‘Bayesian Dirichlet mixture model for multivariate ex- tremes: A re-parametrization’,Computational Statistics & Data Analysis71, 542–567
2014
-
[45]
and Taqqu, M
Samorodnitsky, G. and Taqqu, M. S. [1994],Stable Non-Gaussian Processes: Stochastic Models with Infinite Variance, Chapman and Hall, New York, London
1994
-
[46]
Scheffler, H.-P. and Stoev, S. [2017], ‘Implicit extremes and implicit max-stable laws’,Extremes 20(2), 265–299. URL:https://doi.org/10.1007/s10687-016-0278-9
-
[47]
Schilling, R. L., Song, R. and Vondracek, Z. [2012],Bernstein Functions: Theory and Appli- cations, De Gruyter, Berlin, Boston. URL:https://doi.org/10.1515/9783110269338
-
[48]
[1960], ‘Bivariate extreme statistics
Sibuya, M. [1960], ‘Bivariate extreme statistics. I’,Ann. Inst. Statist. Math. Tokyo11, 195– 210. URL:https://doi.org/10.1007/bf01682329 /Optimal prediction of extreme events37
-
[49]
[2026a], ‘optXpred: R code for optimal extreme event prediction via homogoneous functions’
Stoev, S. [2026a], ‘optXpred: R code for optimal extreme event prediction via homogoneous functions’. Private. Ask the author for access. Provides R code and a Shiny app for optimal extreme event prediction via homogoneous functions. URL:https://github.com/cctoeb/optXpred.git
-
[50]
[2026b], ‘An R Shiny app illustrating the optXpred software for optimal extreme event prediction via homogeneous functions’
Stoev, S. [2026b], ‘An R Shiny app illustrating the optXpred software for optimal extreme event prediction via homogeneous functions’. URL:https://rada.stat.lsa.umich.edu/shiny/sstoev/optXpred/
-
[51]
and Kanamori, T
Sugiyama, M., Suzuki, T. and Kanamori, T. [2012],Density Ratio Estimation in Machine Learning, 1 edn, Cambridge University Press
2012
-
[52]
[2013], ‘A plug-in approach to Neyman–Pearson classification’,Journal of Machine Learning Research14(92), 3011–3040
Tong, X. [2013], ‘A plug-in approach to Neyman–Pearson classification’,Journal of Machine Learning Research14(92), 3011–3040. URL:https://jmlr.org/papers/v14/tong13a.html
2013
-
[53]
and Feng, Y
Tong, X., Xia, L., Wang, J. and Feng, Y. [2020], ‘Neyman–Pearson classification: Parametrics and sample size requirement’,Journal of Machine Learning Research21(12), 1–48. URL:https://jmlr.org/papers/v21/18-577.html
2020
-
[54]
Tsybakov, A. B. [2009],Introduction to Nonparametric Estimation, Springer series in statistics, Springer, Dordrecht. URL:https://cds.cern.ch/record/1315296
arXiv 2009
-
[55]
Valavi, R., Guillera-Arroita, G., Lahoz-Monfort, J. J. and Elith, J. [2022], ‘Predictive per- formance of presence-only species distribution models: A benchmark study with reproducible code’,Ecological Monographs92(1), e01486
2022
-
[56]
van der Vaart, A. W. [1998],Asymptotic statistics, Vol. 3 ofCambridge Series in Statistical and Probabilistic Mathematics, Cambridge University Press, Cambridge
1998
-
[57]
Verma, V., Stoev, S. and Chen, Y. [2024], ‘On the optimal prediction of extreme events in heavy-tailed time series with applications to solar flare forecasting’. URL:https://arxiv.org/abs/2407.11887
arXiv 2024
-
[58]
Wright, M. N. and Ziegler, A. [2017], ‘ranger: A fast implementation of random forests for high dimensional data in C++ and R’,Journal of Statistical Software77(1), 1–17
2017
-
[59]
and Jones, M
Yu, K. and Jones, M. C. [1998], ‘Local linear quantile regression’,Journal of the American Statistical Association93(441), 228–237. Supplements A: Examples A.1. Examples of optimal predictors In this section, we elaborate on the closed-form optimal predictors discussed in Section 2. Specifi- cally, we prove the stated characterizations of the optimal pred...
1998
-
[60]
There exists a positiveα >0such thata n ∼ℓ(n)n 1/α, for some slowly varying functionℓ(·)
-
[61]
The measureµsatisfies the scaling relation: µ(t·A) =t −αµ(A),for allt >0andA∈ B(τ).(B.3)
-
[62]
Consequently, the parameterα >0, referred to as the tail exponent ofZis unique, and so is the measureµ, up to a rescaling factor
If alsoZ∈RV({b n}, ν), then an bn →c∈(0,∞),andµ(A) =c −αν(A),for allA∈ B(τ). Consequently, the parameterα >0, referred to as the tail exponent ofZis unique, and so is the measureµ, up to a rescaling factor
-
[63]
We have thatZ∈RV α(Rk,{a n}, c Z, τ, σ)according to Definition 3.1, where cZ :=µ({τ >1})andσ(B) := 1 µ({τ >1}) µ({x/τ(x)∈B}), B∈ B(S τ)
-
[64]
The proof of this result can be derived from many excellent treatments in the literature
Conversely, ifZ∈RV α(Rk,{a n}, c Z, τ, σ)according to Definition 3.1, thenZ∈RV α({an}, µ) according to Definition B.1, where µ◦T −1 τ ((r,∞)×B)) =c Zr−ασ(B),for allr >0, B∈ B(S τ),(B.4) and whereT τ :R k \{τ= 0} →(0,∞)×S τ is the generalized polar coordinated homeomorphism defined asT τ(x) = (τ(x), x/τ(x)). The proof of this result can be derived from man...
-
[65]
quantiles
thatZ:= (Y, X)∈RV 1(R+ ×R d,{a (Z) n :=n}, c Z, τ, σ), where withc i := (bi, ai), we have cZ = pX i=1 τ(c i) = pX i=1 bi +∥a i∥. In this case, the angular measureσisdiscreteand takes the form: σ(dθ) = 1 cZ pX i=1 τ(c i)δci/τ(c i)(dθ).(C.2) We begin with a counterpart to Proposition 3.2 and for the sake of completeness, we provide a proof. Proposition C.1....
2010
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.