Recognition: no theorem link
Penalized estimation of GEV parameters for extreme quantile regression
Pith reviewed 2026-05-15 00:58 UTC · model grok-4.3
The pith
Penalized likelihood estimation of GEV parameters weighted by generalized random forest scores produces stable estimates for extreme conditional quantiles.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Maximizing a penalized likelihood for the parameters of a covariate-dependent generalized extreme value distribution, where the likelihood contributions are weighted according to generalized random forest similarity measures, produces estimators that remain reliable for small samples and asymptotically efficient for large ones.
What carries the argument
The penalized weighted likelihood for GEV parameters, with weights derived from generalized random forests that capture local similarity in the covariate space.
If this is right
- The method permits extrapolation to quantiles beyond the range of observed data by relying on the GEV tail model.
- It accommodates complex, high-dimensional covariate structures without requiring explicit parametric forms for the parameter functions.
- As sample size increases, the estimator converges to the same limit as the unpenalized maximum likelihood estimator.
- Practical performance gains appear in finite-sample settings where standard approaches suffer from high variance or bias in the tails.
Where Pith is reading between the lines
- The same weighting-plus-penalty construction could be transferred to other parametric tail models such as the generalized Pareto distribution for threshold exceedances.
- Adapting the random-forest weights to respect serial or spatial dependence would allow application to time-series or spatial extreme-value problems.
- Direct comparisons against neural-network or other nonparametric regression methods for GEV parameters would clarify whether the forest-plus-penalty combination offers unique finite-sample advantages.
Load-bearing premise
Block maxima of the conditional response distribution follow a generalized extreme value law whose parameters vary smoothly with the covariates, and the generalized random forest weights form a valid scheme for the resulting likelihood.
What would settle it
In repeated small-sample simulations drawn from a known conditional distribution whose block maxima are exactly generalized extreme value, the penalized estimator would show no reduction in mean squared error for high-quantile estimates relative to ordinary maximum-likelihood estimation.
read the original abstract
Quantile regression (QR) relies on the estimation of conditional quantiles and explores the relationships between independent and dependent variables. At high probability levels, classical QR methods face extrapolation difficulties due to the scarcity of data in the tail of the distribution. Another challenge arises when the number of predictors is large and the quantile function exhibits a complex structure. In this work, we propose an estimation method designed to overcome these challenges. To enhance extrapolation in the tail of the conditional response distribution, we model block maxima using the generalized extreme value (GEV) distribution, where the parameters depend on covariates. To address the second challenge, we adopt an approach based on generalized random forests (grf) to estimate these parameters. Specifically, we maximize a penalized likelihood, weighted by the weights obtained through the grf method. This penalization helps overcome the limitations of the maximum likelihood estimator (MLE) in small samples, while preserving its optimality in large samples. The effectiveness of our method is validated through comparisons with other approaches in simulation studies and an application to U.S. wage data.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a method for extreme quantile regression that models block maxima of the conditional response via a generalized extreme value (GEV) distribution whose parameters are covariate-dependent. It estimates these parameters by maximizing a penalized likelihood weighted by generalized random forest (grf) weights, claiming that the penalization mitigates small-sample instability of the MLE while preserving large-sample optimality. Effectiveness is assessed via simulation comparisons and an application to U.S. wage data.
Significance. If the asymptotic equivalence to the efficient MLE can be rigorously established under random grf weighting, the approach would provide a practical bridge between extreme-value theory and machine-learning weighting schemes for high-dimensional tail estimation, potentially improving extrapolation in sparse tail regions.
major comments (3)
- [Abstract] Abstract: the central claim that penalization 'preserves its optimality in large samples' is unsupported; no theorem, expansion of the estimating equations, or rate condition on the penalty parameter is supplied to show that the penalty term vanishes relative to the grf-weighted log-likelihood when the weights are random and data-dependent.
- [Abstract] Abstract: the explicit functional form of the penalty is never stated, nor is any procedure given for selecting the tuning parameter; without these, it is impossible to verify that the estimator remains consistent or to reproduce the reported simulation and real-data results.
- [Abstract] Abstract: no information is provided on how uncertainty (standard errors, confidence intervals) is quantified for the estimated GEV parameters or the resulting extreme quantiles, which is load-bearing for any practical use of the method.
minor comments (1)
- [Abstract] The abstract would benefit from a concise statement of the precise penalty term and the range of sample sizes considered in the simulations.
Simulated Author's Rebuttal
We thank the referee for the careful review and constructive feedback on our manuscript. We agree that the abstract requires substantial clarification and additional technical details. We address each major comment below and will implement the indicated revisions in the next version of the paper.
read point-by-point responses
-
Referee: [Abstract] Abstract: the central claim that penalization 'preserves its optimality in large samples' is unsupported; no theorem, expansion of the estimating equations, or rate condition on the penalty parameter is supplied to show that the penalty term vanishes relative to the grf-weighted log-likelihood when the weights are random and data-dependent.
Authors: We acknowledge that the manuscript provides no formal asymptotic argument establishing that the penalty vanishes relative to the grf-weighted likelihood under random, data-dependent weights. In the revision we will delete the phrase 'preserving its optimality in large samples' from the abstract. We will replace it with the statement that the penalization is introduced to improve finite-sample stability of the GEV MLE while the unpenalized estimator retains its standard consistency properties under the usual regularity conditions for GEV models. A short remark on the penalty rate will be added to the methods section to make this qualification explicit. revision: yes
-
Referee: [Abstract] Abstract: the explicit functional form of the penalty is never stated, nor is any procedure given for selecting the tuning parameter; without these, it is impossible to verify that the estimator remains consistent or to reproduce the reported simulation and real-data results.
Authors: The full manuscript defines the penalty as a quadratic ridge penalty on the three GEV parameters (location, scale, shape) of the form λ‖θ − θ₀‖², where θ₀ is a preliminary unpenalized estimate. The tuning parameter λ is chosen by K-fold cross-validation that maximizes the weighted log-likelihood on held-out blocks. We will insert these explicit expressions and the cross-validation procedure into the revised abstract and expand the corresponding paragraph in Section 3 so that the estimator is fully reproducible. revision: yes
-
Referee: [Abstract] Abstract: no information is provided on how uncertainty (standard errors, confidence intervals) is quantified for the estimated GEV parameters or the resulting extreme quantiles, which is load-bearing for any practical use of the method.
Authors: We agree that uncertainty quantification must be described. In the revision we will add a dedicated subsection (new Section 3.4) that specifies a nonparametric bootstrap procedure: we resample the grf weights together with the block maxima, re-estimate the penalized GEV parameters on each replicate, and obtain percentile confidence intervals for both the parameters and the implied extreme quantiles. The bootstrap will be applied to the simulation experiments and the wage-data application, with results reported in the revised tables and figures. revision: yes
Circularity Check
No significant circularity; method combines established GEV and grf components without self-referential reduction
full rationale
The paper proposes maximizing a grf-weighted penalized likelihood for covariate-dependent GEV parameters. The central claim that penalization overcomes small-sample MLE issues while preserving large-sample optimality relies on standard asymptotic theory for penalized MLEs and the validity of grf weights as a weighting scheme, neither of which reduces by construction to a quantity defined in terms of itself. No equations equate the estimator to a fitted input or rename a known result as a new derivation. Any self-citations (if present) are not load-bearing for the uniqueness or optimality statements, as the approach builds on external GEV and random-forest literature. The derivation chain is self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
free parameters (1)
- penalty parameter
axioms (1)
- domain assumption Block maxima of the conditional response follow a generalized extreme value distribution
Reference graph
Works this paper leans on
-
[1]
Stochastic Hydrology and Hydraulics2(3), 201–212 (1988) https://doi.org/10.1007/BF01550842
Arnell, N.W.: Unbiased estimation of flood risk with the GEV distribution. Stochastic Hydrology and Hydraulics2(3), 201–212 (1988) https://doi.org/10.1007/BF01550842
-
[2]
Econometrica46(1), 33 (1978) https://doi
Koenker, R., Bassett, G.: Regression Quantiles. Econometrica46(1), 33 (1978) https://doi. org/10.2307/1913643
-
[3]
Beyerlein, A.: Quantile regression—opportunities and challenges from a user’s perspective. American journal of epidemiology180(3), 330–331 (2014) 20 Figure D4 Variation of the parametersˆµ(x), ˆσ(x), and ˆξ(x) as a function of age
work page 2014
-
[4]
Econometrica74(2), 539–563 (2006)
Angrist, J., Chernozhukov, V., Fernández-Val, I.: Quantile regression under misspecification, with an application to the US wage structure. Econometrica74(2), 539–563 (2006)
work page 2006
-
[5]
Koenker, R.: Quantile regression: 40 years on. Annual Review of Economics9(2017), 155–176 (2017) https://doi.org/10.1146/annurev-economics-063016-103651
-
[6]
Communications in Statistics-Theory and Methods45(11), 3097– 3113 (2016)
Benziadi, F., Laksaci, A., Tebboune, F.: Recursive kernel estimate of the conditional quantile for functional ergodic data. Communications in Statistics-Theory and Methods45(11), 3097– 3113 (2016)
work page 2016
-
[7]
Frontiers in Ecology and the Environment1(8), 412–420 (2003)
Cade, B.S., Noon, B.R.: A gentle introduction to quantile regression for ecologists. Frontiers in Ecology and the Environment1(8), 412–420 (2003)
work page 2003
-
[8]
Journal of the American Statistical Association 104(487), 1233–1240 (2009)
Wang, H., Tsai, C.-L.: Tail index regression. Journal of the American Statistical Association 104(487), 1233–1240 (2009)
work page 2009
-
[9]
Annals of Statistics, 806–839 (2005)
Chernozhukov, V.: Extremal quantile regression. Annals of Statistics, 806–839 (2005)
work page 2005
-
[10]
Energy Conversion and Management151, 737–752 (2017) https://doi.org/10
Zheng,W.,Peng,X.,Lu,D.,Zhang,D.,Liu,Y.,Lin,Z.,Lin,L.:Compositequantileregression extreme learning machine with feature selection for short-term wind speed forecasting: A new approach. Energy Conversion and Management151, 737–752 (2017) https://doi.org/10. 1016/j.enconman.2017.09.029
work page 2017
-
[11]
Canadian Journal of Statistics50(1), 267–286 (2022)
Zhu,H.,Li,Y.,Liu,B.,Yao,W.,Zhang,R.:Extremequantileestimationforpartialfunctional linear regression models with heavy-tailed distributions. Canadian Journal of Statistics50(1), 267–286 (2022)
work page 2022
-
[12]
Computational Statistics & Data Analysis56(12), 4081–4096 (2012)
Schaumburg, J.: Predicting extreme value at risk: Nonparametric quantile regression with refinements from extreme value theory. Computational Statistics & Data Analysis56(12), 4081–4096 (2012)
work page 2012
-
[13]
Stochastic Environmental Research and Risk Assessment, 1–18 (2022)
Saulo, H., Vila, R., Bittencourt, V.L., Leão, J., Leiva, V., Christakos, G.: On a new extreme 21 value distribution: characterization, parametric quantile regression, and application to ex- treme air pollution events. Stochastic Environmental Research and Risk Assessment, 1–18 (2022)
work page 2022
-
[14]
Empirical Economics 62(1), 7–33 (2022) https://doi.org/10.1007/ s00181-020-01898-0
Chernozhukov, V., Fernández-Val, I., Melly, B.: Fast algorithms for the quantile regression process. Empirical Economics 62(1), 7–33 (2022) https://doi.org/10.1007/ s00181-020-01898-0
work page 2022
-
[15]
Journal of Probability and Statistics2021, 1–10 (2021) https://doi.org/10.1155/2021/6697120
Kithinji, M.M., Mwita, P.N., Kube, A.O.: Adjusted Extreme Conditional Quantile Autore- gression with Application to Risk Measurement. Journal of Probability and Statistics2021, 1–10 (2021) https://doi.org/10.1155/2021/6697120
-
[16]
(eds.): Handbook of Quantile Regression
Koenker, R., Chernozhukov, V., He, X., Peng, L. (eds.): Handbook of Quantile Regression. Chapman and Hall/CRC, New York (2017). https://doi.org/10.1201/9781315120256
-
[17]
Gnecco, N., Terefe, E.M., Engelke, S.: Extremal Random Forests. Journal of the American Statistical Association, 1–24 (2024) https://doi.org/10.1080/01621459.2023.2300522
-
[18]
Stochastic environmental research and risk assessment32, 3207–3225 (2018)
Cannon, A.J.: Non-crossing nonlinear regression quantiles by monotone composite quantile regression neural network, with application to rainfall extremes. Stochastic environmental research and risk assessment32, 3207–3225 (2018)
work page 2018
-
[19]
Chaudhuri, P., Loh, W.-Y.: Nonparametric estimation of conditional quantiles using quantile regression trees. Bernoulli, 561–576 (2002)
work page 2002
-
[20]
Journal of machine learning research 7(6) (2006)
Meinshausen, N., Ridgeway, G.: Quantile regression forests. Journal of machine learning research 7(6) (2006)
work page 2006
-
[21]
Journal of human resources, 88–126 (1998)
Buchinsky, M.: Recent advances in quantile regression models: a practical guideline for empirical research. Journal of human resources, 88–126 (1998)
work page 1998
-
[22]
The Annals of Statistics 47(2), 1148–1178 (2019) https://doi.org/10.1214/18-AOS1709
Athey, S., Tibshirani, J., Wager, S.: Generalized random forests. The Annals of Statistics 47(2), 1148–1178 (2019) https://doi.org/10.1214/18-AOS1709
-
[23]
Journal of Machine Learning Research22, 111–111138 (2020)
Ye, S.S., Padilla, O.H.M.: Non-parametric quantile regression via the k-nn fused lasso. Journal of Machine Learning Research22, 111–111138 (2020)
work page 2020
-
[24]
Earth and Space Science9(11), 2022–002571 (2022)
Yao, L., Lu, J., Zhang, W., Qin, J., Zhou, C., Tran, N.N., Pinagé, E.R.: Spatiotemporal Anal- ysis of Extreme Temperature Change on the Tibetan Plateau Based On Quantile Regression. Earth and Space Science9(11), 2022–002571 (2022)
work page 2022
-
[25]
Journal of Hydrology577, 123957 (2019)
Tyralis, H., Papacharalampous, G., Burnetas, A., Langousis, A.: Hydrological post-processing using stacked generalization of quantile regression algorithms: Large-scale application over CONUS. Journal of Hydrology577, 123957 (2019)
work page 2019
-
[26]
Youngman, B.D.: Generalized Additive Models for Exceedances of High Thresholds With an Application to Return Level Estimation for U.S. Wind Gusts. Journal of the American Statistical Association114(528), 1865–1879 (2019)
work page 2019
-
[27]
Extremes (2023) https://doi.org/10.1007/s10687-023-00473-x
Velthoen, J., Dombry, C., Cai, J.-J., Engelke, S.: Gradient boosting for extreme quantile regression. Extremes (2023) https://doi.org/10.1007/s10687-023-00473-x
-
[28]
The Annals of Applied Statistics18(4), 2818–2839 (2024)
Pasche, O.C., Engelke, S.: Neural networks for extreme quantile regression with an application to forecasting of flood risk. The Annals of Applied Statistics18(4), 2818–2839 (2024)
work page 2024
-
[29]
Vidagbandji, L.M., Berred, A., Bertelle, C., Amanton, L.: Generalized random forest for extreme quantile regression. Communications in Statistics - Simulation and Computation 0(0), 1–24 (2025) https://doi.org/10.1080/03610918.2025.2543854 22
-
[30]
Springer Series in Operations Research and Fi- nancial Engineering
Haan, L., Ferreira, A.: Extreme Value Theory. Springer Series in Operations Research and Fi- nancial Engineering. Springer, New York, NY (2006). https://doi.org/10.1007/0-387-34471-3
-
[31]
Coles, S.: An Introduction to Statistical Modeling of Extreme Values. Springer Series in Statistics. Springer, London (2001). https://doi.org/10.1007/978-1-4471-3675-0
-
[32]
Extremes 2(1), 5–23 (1999) https://doi.org/10.1023/A:1009905222644
Coles, S.G., Dixon, M.J.: Likelihood-Based Inference for Extreme Value Models. Extremes 2(1), 5–23 (1999) https://doi.org/10.1023/A:1009905222644
-
[33]
Fisher, R.A., Tippett, L.H.C.: Limiting forms of the frequency distribution of the largest or smallest member of a sample. Mathematical Proceedings of the Cambridge Philosophical Society 24(2), 180–190 (1928) https://doi.org/10.1017/S0305004100015681
-
[34]
Annals of Mathematics44(3), 423–453 (1943) https://doi.org/10.2307/1968974
Gnedenko, B.: Sur La Distribution Limite Du Terme Maximum D’Une Serie Aleatoire. Annals of Mathematics44(3), 423–453 (1943) https://doi.org/10.2307/1968974
-
[35]
Non-Stationary Analysis of Extreme Rainfall in African Test Cities
De Paola, F., Giugni, M., Pugliese, F., Annis, A., Nardi, F.: GEV Parameter Estimation and Stationary vs. Non-Stationary Analysis of Extreme Rainfall in African Test Cities. Hydrology 5(2), 28 (2018) https://doi.org/10.3390/hydrology5020028
-
[36]
Extremes 20(4), 839–872 (2017) https://doi.org/10.1007/ s10687-017-0292-6
Bücher, A., Segers, J.: On the maximum likelihood estimator for the Generalized Extreme-Value distribution. Extremes 20(4), 839–872 (2017) https://doi.org/10.1007/ s10687-017-0292-6
work page 2017
-
[37]
Bernoulli21(1), 420–436 (2015)
Dombry, C.: Existence and consistency of the maximum likelihood estimators for the extreme value index within the block maxima framework. Bernoulli21(1), 420–436 (2015)
work page 2015
-
[38]
Bernoulli 25(3), 1690–1723 (2019) https://doi.org/10.3150/18-BEJ1032
Dombry, C., Ferreira, A.: Maximum likelihood estimators based on the block maxima method. Bernoulli 25(3), 1690–1723 (2019) https://doi.org/10.3150/18-BEJ1032
-
[39]
Machine learning45, 5–32 (2001)
Breiman, L.: Random forests. Machine learning45, 5–32 (2001)
work page 2001
-
[40]
Journal of Statistical Software103(3), 1–26 (2022) https://doi.org/10.18637/jss.v103.i03
Youngman, B.D.: evgam: An r package for generalized additive extreme value models. Journal of Statistical Software103(3), 1–26 (2022) https://doi.org/10.18637/jss.v103.i03
-
[41]
Communications of the ACM7(12), 701–702 (1964) https://doi.org/10.1145/355588.365104
Halton, J.H.: Algorithm 247: Radical-inverse quasi-random point sequence. Communications of the ACM7(12), 701–702 (1964) https://doi.org/10.1145/355588.365104
-
[42]
Wang, H.J., Li, D.: Estimation of Extreme Conditional Quantiles Through Power Trans- formation. Journal of the American Statistical Association 108(503), 1062–1074 (2013) https://doi.org/10.1080/01621459.2013.820134
-
[43]
Springer Series in Statistics, vol
Hastie, T.: The Elements of Statistical Learning, Second edition edn. Springer Series in Statistics, vol. 2. Springer, New York, NY (2017)
work page 2017
-
[44]
Mathematical Problems in Engineering 2016, 1–9 (2016) 23
Wang, J., You, S., Wu, Y., Zhang, Y., Bin, S.: A Method of Selecting the Block Size of BMM for Estimating Extreme Loads in Engineering Vehicles. Mathematical Problems in Engineering 2016, 1–9 (2016) 23
work page 2016
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.