On the optimal prediction of extreme events

Benjamin Bobbia; Stilian Stoev

arxiv: 2606.26270 · v1 · pith:VNHYNVO6new · submitted 2026-06-24 · 🧮 math.ST · stat.ME · stat.TH

Pith reviewed 2026-06-26 00:56 UTC · model grok-4.3

classification 🧮 math.ST · stat.ME · stat.TH

keywords multivariate regular variation · angular measure · extreme value theory · peaks-over-threshold · optimal prediction · tail dependence coefficient · conditional quantile · homogeneous predictor

The pith

The asymptotically optimal positive homogeneous predictor of extreme Y given X is the non-extreme conditional quantile of a tilted distribution derived from the angular measure.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper characterizes the best positive homogeneous predictors for extreme values of a response Y in terms of covariates X when observations are scarce in the tail. It works under the assumption that the joint distribution is multivariate regularly varying, so the tail dependence is summarized by an angular measure on the unit sphere. The asymptotic performance of any such predictor reduces to the tail dependence coefficient, an integral functional of that angular measure, which turns the search for the optimum into a variational problem. Solving the variational problem produces an explicit form for the optimal predictor as a conditional quantile taken from a tilted version of the angular distribution. This form immediately yields peaks-over-threshold estimators that the paper proves are consistent across wide classes of angular measures.

Core claim

Under multivariate regular variation of (Y,X), the asymptotic prediction precision of any positive homogeneous predictor h(X) equals the tail dependence coefficient λ(Y,h(X)), expressed as an integral over the angular measure of the pair. The h that maximizes this coefficient is the non-extreme conditional quantile of the distribution obtained by tilting the angular measure; the associated peaks-over-threshold estimators are universally consistent over large classes of angular measures.
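For orientation, the coefficient in the core claim can be written out as follows. This is the standard form of the tail-dependence coefficient and of its angular representation, not a quotation from the paper. The sketch assumes that $Y \ge 0$, that $h \ge 0$ is positive homogeneous of degree one ($h(cx) = c\,h(x)$ for $c>0$), that $\alpha>0$ is the tail exponent, and that $\sigma$ is the angular probability measure on the unit sphere $S$ with points written $\theta = (\theta_Y, \theta_X)$; these symbols are assumptions of the sketch.

$$
\lambda(Y,h(X)) \;=\; \lim_{p\uparrow 1} \mathbb{P}\big(h(X) > F_{h(X)}^{\leftarrow}(p) \,\big|\, Y > F_Y^{\leftarrow}(p)\big)
\;=\; \int_{S} \min\Big\{\frac{\theta_Y^{\alpha}}{c_Y},\; \frac{h(\theta_X)^{\alpha}}{c_h}\Big\}\,\sigma(d\theta),
$$

$$
c_Y := \int_{S} \theta_Y^{\alpha}\,\sigma(d\theta), \qquad c_h := \int_{S} h(\theta_X)^{\alpha}\,\sigma(d\theta), \qquad c_Y,\,c_h \in (0,\infty).
$$

The right-hand side is unchanged when $h$ is multiplied by a positive constant, so the variational problem is over the shape of $h$ on the sphere rather than its scale.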

What carries the argument

The angular measure of the multivariate regular variation of (Y,X), which encodes all tail dependence; the optimal predictor is recovered as the conditional quantile of the distribution tilted by this measure.
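In practice the angular measure is not observed and is usually approximated from the most extreme observations. The following is a minimal sketch of that peaks-over-threshold step, assuming the Euclidean norm as the radial function and the `k` largest radii as the extreme sample; the function name `pot_angles` and both choices are illustrative assumptions, not the paper's implementation.

```python
import numpy as np


def pot_angles(z, k):
    """Return the angular parts of the k observations with the largest radius.

    z : array of shape (n, dim), rows are observations of (Y, X).
    k : number of extreme observations to keep, with 0 < k < n.
    The Euclidean norm is an assumption; any positive homogeneous radial
    function could play the same role.
    """
    z = np.asarray(z, dtype=float)
    if z.ndim != 2:
        raise ValueError("z must be a 2-D array")
    n = z.shape[0]
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < n")
    radii = np.linalg.norm(z, axis=1)
    # Threshold = (k+1)-th largest radius, so exactly k points exceed it
    # when there are no ties.
    threshold = np.sort(radii)[n - k - 1]
    keep = radii > threshold
    angles = z[keep] / radii[keep, None]
    return angles, threshold
```

The rows of `angles` lie on the unit sphere, and their empirical distribution serves as a sample-based stand-in for the angular measure.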

If this is right

Any positive homogeneous predictor's performance is exactly the tail dependence coefficient expressed as an integral functional of the angular measure.
The variational problem for the optimal predictor admits an explicit solution via the tilted conditional quantile for arbitrary angular measures.
Peaks-over-threshold estimators constructed from the tilted quantile are consistent without further parametric restrictions on the angular measure.
In the reported experiments, the resulting procedure performs close to optimal oracle predictors, and it is also applied to extreme solar flare prediction.
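The exact tilting used in the paper is not stated in this summary. As a generic illustration of what "a quantile of a tilted distribution" means computationally, one can reweight a sample by nonnegative tilt weights and read off a quantile of the reweighted empirical distribution. The function name `weighted_quantile` and the weights are hypothetical placeholders, and the sketch ignores the conditioning on covariates.

```python
import numpy as np


def weighted_quantile(values, weights, q):
    """Quantile of the empirical distribution of `values` tilted by `weights`.

    Returns the smallest sample value whose tilted cumulative weight
    reaches level q. The weights are a placeholder for a tilt; they are
    NOT the paper's tilting of the angular measure.
    """
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if values.ndim != 1 or values.shape != weights.shape:
        raise ValueError("values and weights must be 1-D arrays of equal length")
    if np.any(weights < 0) or weights.sum() <= 0:
        raise ValueError("weights must be nonnegative with positive sum")
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    order = np.argsort(values)
    sorted_values = values[order]
    # Tilted empirical CDF evaluated at the sorted sample points.
    cdf = np.cumsum(weights[order]) / weights.sum()
    idx = int(np.searchsorted(cdf, q, side="left"))
    return float(sorted_values[min(idx, len(sorted_values) - 1)])
```

With equal weights this reduces to an ordinary empirical quantile; weights that grow with the sample values shift the quantile upward.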

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same tilting construction could be used to derive optimal predictors when several response variables are extreme simultaneously.
Replacing the homogeneous restriction with a more flexible class of functions might improve finite-sample performance while retaining the regular-variation tail analysis.
The method supplies a natural benchmark against which existing extreme regression techniques can be compared on data sets with verifiable angular measures.

Load-bearing premise

The pair (Y,X) must be multivariate regularly varying so that its tail dependence structure is fully captured by a finite angular measure on the unit sphere.
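In symbols, the premise can be sketched as follows, using the usual formulation of multivariate regular variation. The normalizing sequence $a_n$, the limit measure $\mu$, the radial function $\tau$, the constant $c_Z$, and the mode of convergence (a suitable vague convergence on sets bounded away from the origin) are stated here as assumptions of the sketch rather than as the paper's exact definition.

$$
n\,\mathbb{P}\big(a_n^{-1} Z \in \cdot\,\big) \;\longrightarrow\; \mu(\cdot), \quad n\to\infty, \qquad Z := (Y,X),
$$

$$
\mu(tA) = t^{-\alpha}\,\mu(A) \quad \text{for all } t>0,
$$

$$
\mu\circ T_\tau^{-1}\big((r,\infty)\times B\big) = c_Z\, r^{-\alpha}\,\sigma(B), \qquad T_\tau(x) := \big(\tau(x),\, x/\tau(x)\big).
$$

The last line is the polar decomposition: the radius has a power-law tail with exponent $\alpha$, and the angular measure $\sigma$ describes the directions in which extremes occur.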

What would settle it

A simulation with a known angular measure in which the tail dependence coefficient achieved by the estimated predictor is strictly less than the theoretical maximum computed directly from the angular measure.
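A toy version of such a check is sketched below. The model is an assumption chosen for illustration, not one taken from the paper: $Z_1, Z_2$ are independent standard Pareto variables, $X = Z_1$ and $Y = Z_1 + Z_2$, so the angular measure is discrete and the limiting tail-dependence coefficient of $Y$ and $h(X) = X$ is $1/2$. With a single positive covariate every predictor $h(x) = cx$, $c > 0$, attains the same value. The function names are hypothetical.

```python
import numpy as np


def empirical_tail_dependence(y, y_hat, p):
    """Fraction of y-exceedances of level p that are also y_hat-exceedances.

    Both thresholds are empirical p-quantiles, so the estimate is invariant
    to increasing transformations of y and of y_hat.
    """
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    if y.shape != y_hat.shape or y.ndim != 1:
        raise ValueError("y and y_hat must be 1-D arrays of equal length")
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    exceed = y > np.quantile(y, p)
    if not exceed.any():
        raise ValueError("no exceedances of y at level p")
    return float(np.mean(y_hat[exceed] > np.quantile(y_hat, p)))


def simulate_toy_model(n, seed):
    """Draw (Y, X) with X = Z1 and Y = Z1 + Z2, Z1, Z2 iid standard Pareto."""
    rng = np.random.default_rng(seed)
    # Generator.pareto samples a Lomax law; adding 1 gives classical Pareto.
    z1 = rng.pareto(1.0, size=n) + 1.0
    z2 = rng.pareto(1.0, size=n) + 1.0
    return z1 + z2, z1


y, x = simulate_toy_model(n=200_000, seed=0)
lam_hat = empirical_tail_dependence(y, x, p=0.99)
print(f"empirical tail dependence at p=0.99: {lam_hat:.3f} (limit: 0.5)")
```

An estimate that stayed well below the limiting value as the sample size and the level `p` increase together would be the kind of evidence described above. At a fixed level, some finite-threshold bias is expected.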

Figures

Figures reproduced from arXiv: 2606.26270 by Benjamin Bobbia, Stilian Stoev.

**Figure 2.** Pairs of boxplots of the empirical tail dependence coefficient…

**Figure 3.** Left panel: Optimal extremal precision $\lambda^{(\mathrm{opt})}_G$ for predicting one component of the Pareto-Dirichlet model via a homogeneous function of the rest (C.9). Right panel: The empirical tail-dependence coefficients $\hat\lambda_p$ for: (i) the asymptotically optimal Oracle predictor $h^{(\mathrm{opt})}(X) \propto \|X\|$ (black solid line); (ii) a non-parametric estimator of the optimal predictor $\hat Y := \hat h(X)$ based on a training and testing samp…

**Figure 4.** Time series of maximum daily X-ray flux (in Watts per square meter). The data is based…

**Figure 5.** Evaluation metrics for the prediction of M-class (left panel) and X-class (right panel) solar…

Original abstract

The prediction of the extremely large values of a response variable $Y$ in terms of a vector of covariates $X=(X_i)_{i=1}^d$ is a fundamental problem arising in many scientific and engineering domains. The scarcity of data in the extremes makes the optimal solution of this problem of particular importance. The optimal predictors of such events can be explicitly characterized in just a few cases and it is of fundamental practical and theoretical interest to develop optimal estimators over large classes of models and predictors. In this work, the focus is on the case where $(Y,X)$ have a multivariate regularly varying distribution and one seeks an optimal predictor expressed as a positive homogeneous function $h(X)$ of the covariates. The asymptotic prediction precision in this setting coincides with the tail-dependence coefficient $\lambda(Y,h(X))$ and it can be expressed as an integral functional of the associated angular measure of $(Y,X)$. Thus, finding asymptotically optimal homogeneous predictors amounts to solving a variational problem. We obtain a general solution to this problem, which is expressed in terms of a non-extreme conditional quantile of a tilted distribution derived from the angular measure. This leads to a general inference methodology for the optimal predictors in the peaks-over-threshold framework from extreme value theory. We establish the universal consistency of these estimators over large classes of angular measures. A general-purpose implementation of the resulting inference procedure is shown to work remarkably well against optimal oracle estimators, as well as in the challenging problem of extreme solar flare prediction.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Paper reduces optimal homogeneous extreme prediction to a conditional quantile of a tilted angular measure and proves consistency of the resulting POT estimators.

The main thing here is that the authors solve the variational problem for the best positive homogeneous predictor of extreme Y given X under multivariate regular variation. They show the solution is a non-extreme conditional quantile of a tilted version of the angular measure, and they give peaks-over-threshold estimators that are consistent over large classes of angular measures.

This extends the handful of special cases that were already known, and the reduction to the quantile form is explicit enough to be useful. The numerical checks against oracles and the solar-flare example indicate the procedure can be implemented and performs close to the theoretical optimum. The argument stays inside the standard regularly-varying framework, so the angular measure captures the tail dependence and the objective is the usual integral expression for the tail-dependence coefficient.

The soft spots are limited. The abstract and stress-test note give no sign of circularity or hidden assumptions that would break the logic, and consistency is claimed only for large classes rather than all measures. Still, without the full proofs it is hard to judge exactly how broad those classes are or how the tilting step behaves in moderate dimensions. The implementation details on threshold choice and finite-sample behavior would also need scrutiny.

This is for people already working with angular measures or extreme-value prediction who want an implementable optimal homogeneous rule rather than ad-hoc choices. A reader comfortable with multivariate regular variation will see the most value.

I would send it to peer review. The core reduction and consistency claim are new enough and the framing is careful enough that referees should evaluate the proofs and the practical scope.

Referee Report

0 major / 2 minor

Summary. The paper considers prediction of extreme values of a response Y given covariates X under the multivariate regular variation assumption on the joint distribution. It shows that the asymptotically optimal positive homogeneous predictor h(X) is obtained by solving a variational problem whose objective is the tail-dependence coefficient expressed as an integral functional of the angular measure; the solution is characterized as a non-extreme conditional quantile of a tilted distribution derived from that measure. Peaks-over-threshold estimators of this predictor are proved to be universally consistent over large classes of angular measures. A general-purpose implementation is presented and shown to perform well relative to oracle estimators, with an application to extreme solar-flare prediction.

Significance. If the central claims hold, the work supplies an explicit, theoretically justified construction for optimal homogeneous extreme predictors together with consistent estimators that apply to broad classes of tail-dependence structures. The reduction of the variational problem to a conditional quantile of a tilted measure, the universal-consistency result, and the reproducible numerical comparison against oracles are concrete strengths that advance both the theory and practice of extreme-value prediction.

minor comments (2)

The abstract states that the estimators are 'universally consistent over large classes of angular measures,' but the precise definition of these classes (e.g., continuity or support conditions on the angular measure) is not visible in the provided summary; a short clarifying sentence in the introduction would help readers assess the scope.
The numerical section reports that the procedure 'works remarkably well against optimal oracle estimators,' yet no quantitative table or figure reference is given in the abstract; adding a brief statement of the reported error metrics would strengthen the claim.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive assessment of our work on optimal homogeneous extreme predictors and for recommending minor revision. No specific major comments appear in the report, so there are no points requiring point-by-point rebuttal. We will incorporate any minor editorial or presentational suggestions in the revised manuscript.

Circularity Check

0 steps flagged

No significant circularity

The paper's derivation begins from the standard multivariate regular variation assumption on (Y,X), expresses the asymptotic prediction error as the tail dependence coefficient λ(Y,h(X)) which is an integral functional of the angular measure, poses the search for optimal positive homogeneous h as a well-posed variational problem, and solves it by exhibiting an explicit non-extreme conditional quantile of a tilted version of that measure. The subsequent POT estimator consistency is established directly over large classes of angular measures without any reduction of the claimed optimum to a fitted parameter, without self-referential definitions, and without load-bearing self-citations that would make the central result equivalent to its inputs by construction. The argument is therefore self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the regular-variation assumption that supplies the angular measure; no free parameters, additional axioms, or invented entities are introduced in the abstract.

axioms (1)

domain assumption: The random vector (Y,X) is multivariate regularly varying.
This is the explicit modeling assumption stated in the abstract that enables the angular-measure representation and the variational problem.

[1] [1]

and Sabourin, A

Aghbalou, A., Portier, F. and Sabourin, A. [2024], Sharp error bounds for imbalanced classifi- cation: How many examples in the minority class?,in‘Proceedings of The 27th International Conference on Artificial Intelligence and Statistics’, Vol. 238 ofProceedings of Machine Learn- ing Research, PMLR, pp. 838–846. URL:https://proceedings.mlr.press/v238/aghb...

2024

[2] [2]

Basrak, B., Davis, R. A. and Mikosch, T. [2002], ‘A characterization of multivariate regular variation’,Ann. Appl. Probab.12(3), 908–920. URL:http://dx.doi.org/10.1214/aoap/1031863174

work page doi:10.1214/aoap/1031863174 2002

[3] [3]

and Molchanov, I

Basrak, B., Milinˇ cevi´ c, N. and Molchanov, I. [2025], ‘Foundations of regular variation on topological spaces’. URL:https://arxiv.org/abs/2503.00921

Pith/arXiv arXiv 2025

[4] [4]

and Segers, J

Beirlant, J., Goegebeur, Y., Teugels, J. and Segers, J. [2004],Statistics of extremes, Wiley Series in Probability and Statistics, John Wiley & Sons, Ltd., Chichester. Theory and applications, With contributions from Daniel De Waal and Chris Ferro. URL:http://dx.doi.org/10.1002/0470012382

work page doi:10.1002/0470012382 2004

[5] [5]

H., Goldie, C

Bingham, N. H., Goldie, C. M. and Teugels, J. L. [1987],Regular Variation, Encyclopedia of Mathematics and its Applications, Cambridge University Press

1987

[6] [6]

and Varron, D

Bobbia, B., Dombry, C. and Varron, D. [2025], ‘A donsker and glivenko-cantelli theorem for random measures linked to extreme value theory’,Scandinavian Journal of Statistics 52(4), 1708–1734. URL:https://onlinelibrary.wiley.com/doi/abs/10.1111/sjos.70007

work page doi:10.1111/sjos.70007 2025

[7] [7]

Bobra, M. G. and Couvidat, S. [2015], ‘Solar flare prediction using SDO/HMI vector magnetic field data with a machine-learning algorithm’,The Astrophysical Journal798(2), 135. URL:https://dx.doi.org/10.1088/0004-637X/798/2/135

work page doi:10.1088/0004-637x/798/2/135 2015

[8] [8]

G., Sun, X., Hoeksema, J

Bobra, M. G., Sun, X., Hoeksema, J. T., Turmon, M., Liu, Y., Hayashi, K., Barnes, G. and Leka, K. D. [2014], ‘The Helioseismic and Magnetic Imager (HMI) vector magnetic field pipeline: SHARPs –space-weather HMI active region patches’,Solar Physics289(9), 3549– 3578. URL:https://doi.org/10.1007/s11207-014-0529-3

work page doi:10.1007/s11207-014-0529-3 2014

[9] [9]

G., Wright, P

Bobra, M. G., Wright, P. J., Turmon, M. J., Vertanen, J., Dissauer, K., Schrijver, C. J., Cheung, M. C. M., Wheatland, M. S., Leka, K. D. and Barnes, G. [2021], ‘SMARTs and SHARPs: Two solar cycles of active region data’,The Astrophysical Journal Supplement Series256(2), 26

2021

[10] [10]

and Davison, A

Boldi, M.-O. and Davison, A. C. [2007], ‘A mixture model for multivariate extremes’,Journal of the Royal Statistical Society: Series B (Statistical Methodology)69(2), 217–229

2007

[11] [11]

and Sabourin, A

Cl´ emen¸ con, S., Huet, N. and Sabourin, A. [2025], ‘On regression in extreme regions’,Electronic Journal of Statistics19(2), 4784–4828

2025

[12] Clémençon, S., Jalalzai, H., Lhaut, S., Sabourin, A. and Segers, J. [2023], ‘Concentration bounds for the empirical angular measure with statistical learning applications’, Bernoulli 29(4), 2797–2827.

[13] Clémençon, S. and Sabourin, A. [2026], ‘Weak signals and heavy tails: Learning theory meets extreme value analysis’, Extremes.

[14] Connor, R. J. and Mosimann, J. E. [1969], ‘Concepts of independence for proportions with a generalization of the Dirichlet distribution’, Journal of the American Statistical Association 64(325), 194–206.

[15] Corradini, M. and Strokorb, K. [2024], ‘Stochastic ordering in multivariate extremes’, Extremes 27, 357–396.

[16] Davis, R. A. and Mikosch, T. [2008], ‘Extreme value theory for space-time processes with heavy-tailed distributions’, Stochastic Process. Appl. 118(4), 560–584. URL: http://dx.doi.org/10.1016/j.spa.2007.06.001

[17] Devroye, L., Györfi, L. and Lugosi, G. [1996], A Probabilistic Theory of Pattern Recognition, Vol. 31 of Stochastic Modelling and Applied Probability, Springer, New York.

[18] Dombry, C. and Ribatet, M. [2015], ‘Functional regular variations, Pareto processes and peaks over threshold’, Stat. Interface 8(1), 9–17. URL: https://doi.org/10.4310/SII.2015.v8.n1.a2

[19] Dudley, R. M. [1989], Real Analysis and Probability, Wadsworth and Brook/Cole.

[20] Dyszewski, P. and Mikosch, T. [2020], ‘Homogeneous mappings of regularly varying vectors’, Ann. Appl. Probab. 30, 2999–3026. URL: https://doi.org/10.1214/20-AAP1579

[21] Elie-Dit-Cosaque, K. and Maume-Deschamps, V. [2022], ‘Random forest estimation of conditional distribution functions and conditional quantiles’, Electronic Journal of Statistics 16(2), 6553–6583.

[22] Fan, J. and Gijbels, I. [1996], Local Polynomial Modelling and Its Applications, Vol. 66 of Monographs on Statistics and Applied Probability, Chapman & Hall, London.

[23] Fougères, A.-L., Nolan, J. P. and Rootzén, H. [2009], ‘Models for Dependent Extremes Using Stable Mixtures’, Scandinavian Journal of Statistics 36(1), 42–59. eprint: https://onlinelibrary.wiley.com/doi/pdf/10.1111/j.1467-9469.2008.00613.x. URL: https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1467-9469.2008.00613.x

[24] Hult, H. and Lindskog, F. [2006], ‘Regular variation for measures on metric spaces’, Publ. Inst. Math. (Beograd) (N.S.) 80(94), 121–140. URL: http://dx.doi.org/10.2298/PIM0694121H

[25] Irvin, J., Rajpurkar, P., Ko, M., Yu, Y., Ciurea-Ilcus, S., Chute, C., Marklund, H., Haghgoo, B., Ball, R., Shpanskaya, K., Lungren, M. P. and Ng, A. Y. [2019], CheXpert: A large chest radiograph dataset with uncertainty labels and expert comparison, in ‘Proceedings of the AAAI Conference on Artificial Intelligence’, Vol. 33, pp. 590–597.

[26] Jalalzai, H., Clémençon, S. and Sabourin, A. [2018], On binary classification in extreme regions, in S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi and R. Garnett, eds, ‘Advances in Neural Information Processing Systems’, Vol. 31, Curran Associates, Inc. URL: https://proceedings.neurips.cc/paper_files/paper/2018/file/0ebcc7...

[27] Janßen, A., Neblung, S. and Stoev, S. [2023], ‘Tail-dependence, exceedance sets, and metric embeddings’, Extremes. URL: https://doi.org/10.1007/s10687-023-00471-z

[28] Joe, H. [2015], Dependence Modeling with Copulas, Chapman & Hall/CRC Monographs on Statistics & Applied Probability, CRC Press, Boca Raton, FL.

[29] Kulik, R. and Soulier, P. [2020], Heavy-Tailed Time Series, Springer Series in Operations Research and Financial Engineering, Springer, New York, NY. URL: http://link.springer.com/10.1007/978-1-0716-0737-4

[30] Le Cam, L. and Yang, G. L. [2000], Asymptotics in Statistics, Springer Series in Statistics, second edn, Springer-Verlag, New York. Some basic concepts. URL: https://doi.org/10.1007/978-1-4612-1166-2

[31] Lindskog, F., Resnick, S. I. and Roy, J. [2014], ‘Regularly varying measures on metric spaces: hidden regular variation and hidden jumps’, Probab. Surv. 11, 270–314. URL: http://dx.doi.org/10.1214/14-PS231

[32] McNeil, A. J. and Nešlehová, J. [2009], ‘Multivariate Archimedean copulas, d-monotone functions and ℓ1-norm symmetric distributions’, Ann. Statist. 37(5B), 3059–3097. URL: https://doi.org/10.1214/07-AOS556

[33] Meinguet, T. and Segers, J. [2012], Regularly varying time series in Banach spaces. Preprint available from arXiv:1001.3262. URL: http://arxiv.org/pdf/1001.3262v1

[34] Meinshausen, N. [2006], ‘Quantile regression forests’, Journal of Machine Learning Research 7, 983–999. URL: https://www.jmlr.org/papers/v7/meinshausen06a.html

[35] Nguyen, X., Wainwright, M. J. and Jordan, M. I. [2010], ‘Estimating divergence functionals and the likelihood ratio by convex risk minimization’, IEEE Transactions on Information Theory 56(11), 5847–5861.

[36] NOAA Space Weather Prediction Center [2023], ‘GOES X-ray Flux Data’, https://www.swpc.noaa.gov/products/goes-x-ray-flux. Accessed: 2025-06-13.

[37] Pollard, D. [2002], A user’s guide to measure theoretic probability, Vol. 8 of Cambridge Series in Statistical and Probabilistic Mathematics, Cambridge University Press, Cambridge.

[38] Portnoy, S. and Koenker, R. [1997], ‘The Gaussian hare and the Laplacian tortoise: computability of squared-error versus absolute-error estimators’, Statistical Science 12(4), 279–300. URL: https://doi.org/10.1214/ss/1030037960

[39] Resnick, S. [2007], Heavy-Tail Phenomena: Probabilistic and Statistical Modeling, number v. 10 in ‘Heavy-tail phenomena: probabilistic and statistical modeling’, Springer. URL: https://books.google.com/books?id=p8uq2QFw9PUC

[40] Resnick, S. I. [1987], Extreme Values, Regular Variation and Point Processes, Springer-Verlag, New York.

[41] Resnick, S. I. [1999], A probability path, Birkhäuser Boston Inc., Boston, MA.

[42] Resnick, S. I. [2024], The art of finding hidden risks, Springer, New York. Hidden Regular Variation in the 21st Century.

[43] Rigollet, P. and Tong, X. [2011], ‘Neyman–Pearson classification, convexity and stochastic constraints’, Journal of Machine Learning Research 12, 2831–2855. URL: https://jmlr.org/papers/v12/rigollet11a.html

[44] Sabourin, A. and Naveau, P. [2014], ‘Bayesian Dirichlet mixture model for multivariate extremes: A re-parametrization’, Computational Statistics & Data Analysis 71, 542–567.

[45] Samorodnitsky, G. and Taqqu, M. S. [1994], Stable Non-Gaussian Processes: Stochastic Models with Infinite Variance, Chapman and Hall, New York, London.

[46] Scheffler, H.-P. and Stoev, S. [2017], ‘Implicit extremes and implicit max-stable laws’, Extremes 20(2), 265–299. URL: https://doi.org/10.1007/s10687-016-0278-9

[47] Schilling, R. L., Song, R. and Vondracek, Z. [2012], Bernstein Functions: Theory and Applications, De Gruyter, Berlin, Boston. URL: https://doi.org/10.1515/9783110269338

[48] Sibuya, M. [1960], ‘Bivariate extreme statistics. I’, Ann. Inst. Statist. Math. Tokyo 11, 195–210. URL: https://doi.org/10.1007/bf01682329

[49] Stoev, S. [2026a], ‘optXpred: R code for optimal extreme event prediction via homogoneous functions’. Private. Ask the author for access. Provides R code and a Shiny app for optimal extreme event prediction via homogoneous functions. URL: https://github.com/cctoeb/optXpred.git

[50] Stoev, S. [2026b], ‘An R Shiny app illustrating the optXpred software for optimal extreme event prediction via homogeneous functions’. URL: https://rada.stat.lsa.umich.edu/shiny/sstoev/optXpred/

[51] Sugiyama, M., Suzuki, T. and Kanamori, T. [2012], Density Ratio Estimation in Machine Learning, 1 edn, Cambridge University Press.

[52] Tong, X. [2013], ‘A plug-in approach to Neyman–Pearson classification’, Journal of Machine Learning Research 14(92), 3011–3040. URL: https://jmlr.org/papers/v14/tong13a.html

[53] Tong, X., Xia, L., Wang, J. and Feng, Y. [2020], ‘Neyman–Pearson classification: Parametrics and sample size requirement’, Journal of Machine Learning Research 21(12), 1–48. URL: https://jmlr.org/papers/v21/18-577.html

[54] Tsybakov, A. B. [2009], Introduction to Nonparametric Estimation, Springer series in statistics, Springer, Dordrecht. URL: https://cds.cern.ch/record/1315296

[55] Valavi, R., Guillera-Arroita, G., Lahoz-Monfort, J. J. and Elith, J. [2022], ‘Predictive performance of presence-only species distribution models: A benchmark study with reproducible code’, Ecological Monographs 92(1), e01486.

[56] van der Vaart, A. W. [1998], Asymptotic statistics, Vol. 3 of Cambridge Series in Statistical and Probabilistic Mathematics, Cambridge University Press, Cambridge.

[57] Verma, V., Stoev, S. and Chen, Y. [2024], ‘On the optimal prediction of extreme events in heavy-tailed time series with applications to solar flare forecasting’. URL: https://arxiv.org/abs/2407.11887

[58] Wright, M. N. and Ziegler, A. [2017], ‘ranger: A fast implementation of random forests for high dimensional data in C++ and R’, Journal of Statistical Software 77(1), 1–17.

[59] Yu, K. and Jones, M. C. [1998], ‘Local linear quantile regression’, Journal of the American Statistical Association 93(441), 228–237.

Supplements

A: Examples

A.1. Examples of optimal predictors

In this section, we elaborate on the closed-form optimal predictors discussed in Section 2. Specifically, we prove the stated characterizations of the optimal pred...

There exists a positive $\alpha > 0$ such that $a_n \sim \ell(n)\, n^{1/\alpha}$, for some slowly varying function $\ell(\cdot)$.

The measure $\mu$ satisfies the scaling relation:

$$\mu(t \cdot A) = t^{-\alpha} \mu(A), \quad \text{for all } t > 0 \text{ and } A \in \mathcal{B}(\tau). \tag{B.3}$$
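As a small worked illustration of how (B.3) is typically used, the step below is a sketch under an added assumption: that $\tau$ is positively homogeneous of degree one, i.e. $\tau(tx) = t\,\tau(x)$ for $t > 0$. The radius $r > 0$ and the angular set $B$ are symbols introduced here for illustration, and the sets involved are assumed to belong to $\mathcal{B}(\tau)$. Under these assumptions, the set $\{\tau > r,\ x/\tau(x) \in B\}$ is the dilation by $r$ of $\{\tau > 1,\ x/\tau(x) \in B\}$, so that:

```latex
\begin{align*}
\mu\bigl(\{x : \tau(x) > r,\ x/\tau(x) \in B\}\bigr)
  &= \mu\bigl(r \cdot \{x : \tau(x) > 1,\ x/\tau(x) \in B\}\bigr) \\
  &= r^{-\alpha}\, \mu\bigl(\{x : \tau(x) > 1,\ x/\tau(x) \in B\}\bigr),
  \qquad r > 0 .
\end{align*}
```

In particular, whenever $\mu(\{\tau > 1\})$ is finite and positive, the ratio of this quantity to $\mu(\{\tau > r\})$ does not depend on $r$: the radial part of $\mu$ decays like $r^{-\alpha}$ while the angular part stays fixed.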

If also $Z \in RV(\{b_n\}, \nu)$, then $a_n / b_n \to c \in (0, \infty)$, and $\mu(A) = c^{-\alpha} \nu(A)$, for all $A \in \mathcal{B}(\tau)$. Consequently, the parameter $\alpha > 0$, referred to as the tail exponent of $Z$, is unique, and so is the measure $\mu$, up to a rescaling factor.

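We have that $Z \in RV_\alpha(\mathbb{R}^k, \{a_n\}, c_Z, \tau, \sigma)$ according to Definition 3.1, where

$$c_Z := \mu(\{\tau > 1\}) \quad \text{and} \quad \sigma(B) := \frac{1}{\mu(\{\tau > 1\})}\, \mu(\{x/\tau(x) \in B\}), \quad B \in \mathcal{B}(S_\tau).$$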

Conversely, if $Z \in RV_\alpha(\mathbb{R}^k, \{a_n\}, c_Z, \tau, \sigma)$ according to Definition 3.1, then $Z \in RV_\alpha(\{a_n\}, \mu)$ according to Definition B.1, where

$$\mu \circ T_\tau^{-1}\bigl((r, \infty) \times B\bigr) = c_Z\, r^{-\alpha} \sigma(B), \quad \text{for all } r > 0,\ B \in \mathcal{B}(S_\tau), \tag{B.4}$$

and where $T_\tau : \mathbb{R}^k \setminus \{\tau = 0\} \to (0, \infty) \times S_\tau$ is the generalized polar coordinate homeomorphism defined as $T_\tau(x) = (\tau(x), x/\tau(x))$. The proof of this result can be derived from man...

that $Z := (Y, X) \in RV_1(\mathbb{R}_+ \times \mathbb{R}^d, \{a_n^{(Z)} := n\}, c_Z, \tau, \sigma)$, where with $c_i := (b_i, a_i)$, we have

$$c_Z = \sum_{i=1}^{p} \tau(c_i) = \sum_{i=1}^{p} \bigl(b_i + \|a_i\|\bigr).$$

In this case, the angular measure $\sigma$ is discrete and takes the form:

$$\sigma(d\theta) = \frac{1}{c_Z} \sum_{i=1}^{p} \tau(c_i)\, \delta_{c_i/\tau(c_i)}(d\theta). \tag{C.2}$$

We begin with a counterpart to Proposition 3.2 and for the sake of completeness, we provide a proof. Proposition C.1....
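The discrete angular measure in (C.2) can be computed directly from the coefficients $c_i = (b_i, a_i)$. The following is a minimal sketch, not the authors' code: it assumes $\tau(y, x) = y + \|x\|$ with the Euclidean norm (the text above does not specify the norm) and nonnegative $b_i$. The function name `discrete_angular_measure` and the numerical values are hypothetical.

```python
import numpy as np

def discrete_angular_measure(b, a):
    """Atoms and weights of the discrete angular measure in (C.2).

    Assumption of this sketch: tau(y, x) = y + ||x|| with the Euclidean
    norm, and b_i >= 0. b has shape (p,), a has shape (p, d).
    """
    b = np.asarray(b, dtype=float)
    a = np.asarray(a, dtype=float)
    tau = b + np.linalg.norm(a, axis=1)             # tau(c_i) = b_i + ||a_i||
    c_Z = tau.sum()                                 # c_Z = sum_i tau(c_i)
    weights = tau / c_Z                             # mass of sigma at atom i
    atoms = np.column_stack([b, a]) / tau[:, None]  # c_i / tau(c_i)
    return c_Z, weights, atoms

# Hypothetical coefficients with p = 3 atoms and d = 2 covariates.
b = np.array([1.0, 0.5, 2.0])
a = np.array([[3.0, 4.0], [0.0, 1.0], [1.0, 0.0]])
c_Z, weights, atoms = discrete_angular_measure(b, a)
```

By construction the weights sum to one and every atom satisfies $\tau(c_i/\tau(c_i)) = 1$, so $\sigma$ is a probability measure supported on the unit $\tau$-sphere.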