On the Dirichlet-kernel Gasser-Müller estimator and its competitors for fixed design regression on the simplex
Pith reviewed 2026-05-23 04:09 UTC · model grok-4.3
The pith
A Dirichlet-kernel Gasser-Müller estimator for fixed-design regression on the simplex has explicit pointwise bias, variance, asymptotic normality, and mean integrated squared error.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The Dirichlet-kernel Gasser-Müller estimator is constructed by weighting the observed responses with a Dirichlet kernel centered at the target point inside the simplex. Its pointwise bias is of the order of the bandwidth, and its variance is of the order of one over the product of the sample size and the bandwidth raised to the power of the dimension. The properly centered and scaled estimator converges in distribution to a normal random variable. The mean integrated squared error admits an explicit asymptotic expansion whose leading terms can be minimized with respect to the bandwidth.
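To fix notation for the construction just described, the following display is a sketch rather than a quotation from the paper. The symbols $d$, $b$, $n$, $\mathcal{S}_d$, $B_i$ and $Y_i$ are assumptions introduced here, and the parametrization $s/b+1$ follows the convention of Chen's beta-kernel smoothers, which the paper extends.

```latex
% Assumed notation (not taken from the paper):
%   S_d = { x in [0,1]^d : ||x||_1 <= 1 }  the d-dimensional simplex,
%   b > 0 the bandwidth, Y_1,...,Y_n the responses,
%   B_1,...,B_n a partition of S_d with the i-th design point in B_i.
\[
K_{\alpha,\beta}(x)
  = \frac{\Gamma(\|\alpha\|_1 + \beta)}{\Gamma(\beta)\prod_{j=1}^{d}\Gamma(\alpha_j)}
    \,(1-\|x\|_1)^{\beta-1}\prod_{j=1}^{d} x_j^{\alpha_j-1},
  \qquad x \in \mathcal{S}_d ,
\]
\[
\hat m_{n,b}(s)
  = \sum_{i=1}^{n} Y_i \int_{B_i} K_{s/b+1,\;(1-\|s\|_1)/b+1}(x)\,\mathrm{d}x ,
  \qquad s \in \mathcal{S}_d .
\]
```

Because the Dirichlet density integrates to one over the simplex and the cells $B_i$ partition it, the weights in this sketch are nonnegative and sum to one.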
What carries the argument
The Dirichlet-kernel Gasser-Müller estimator: a weighted average of the responses in which the classical kernel is replaced by a Dirichlet kernel satisfying the required moment and positivity conditions on the simplex.
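As an illustration of this weighted average, here is a minimal Python sketch. It rests on several assumptions that are not confirmed by the source: the simplex is partitioned into Voronoi cells of the design points (Euclidean distance on full compositions), the kernel at a target point `s` is the Dirichlet density with parameter `s / b + 1`, and each cell integral is approximated by Monte Carlo. The function name `dgm_estimate` and its arguments are hypothetical.

```python
import numpy as np

def dgm_estimate(s, design, y, b, n_draws=5000, seed=0):
    """Monte Carlo sketch of a Dirichlet-kernel Gasser-Muller-type estimate.

    s       : target point as a full composition (nonnegative, sums to 1)
    design  : (n, k) array of fixed design points, each row a composition
    y       : (n,) array of responses
    b       : bandwidth (> 0)
    Returns the estimate and the vector of weights.
    """
    rng = np.random.default_rng(seed)
    design = np.asarray(design, dtype=float)
    y = np.asarray(y, dtype=float)
    # Assumed parametrization: Dirichlet kernel with parameter s / b + 1.
    alpha = np.asarray(s, dtype=float) / b + 1.0
    draws = rng.dirichlet(alpha, size=n_draws)               # (n_draws, k)
    # Squared Euclidean distances from every draw to every design point.
    d2 = ((draws ** 2).sum(axis=1)[:, None]
          - 2.0 * draws @ design.T
          + (design ** 2).sum(axis=1)[None, :])              # (n_draws, n)
    # Assumed Voronoi partition: each draw falls in the cell of its nearest
    # design point, so the cell's kernel mass is the fraction of draws in it.
    nearest = d2.argmin(axis=1)
    weights = np.bincount(nearest, minlength=len(y)) / n_draws
    return float(weights @ y), weights
```

The weights depend only on the design, the target point and the bandwidth, so under a fixed design they can be computed once and reused across response vectors.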
If this is right
- The bias and variance expansions yield the optimal bandwidth rate that balances the two terms in the mean integrated squared error.
- Asymptotic normality supplies the limiting distribution needed to form pointwise approximate confidence intervals at interior points.
- The explicit mean integrated squared error formula permits analytic comparison of asymptotic efficiency among the three Dirichlet-kernel estimators.
- The simulation ranking indicates that the local-linear version should be used in preference to the Gasser-Müller version for data sets of moderate size.
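The bandwidth balancing mentioned in the first bullet can be sketched with a generic calculation. The constants $C_1$, $C_2$ and the exponent $\kappa$ are placeholders introduced here, not quantities taken from the paper; the core claim above corresponds to a bias of order $b$ and a variance exponent $\kappa$ equal to the dimension.

```latex
% Generic leading-order trade-off: squared bias C_1 b^2, variance C_2 / (n b^kappa).
\[
\mathrm{AMISE}(b) = C_1 b^{2} + \frac{C_2}{n\,b^{\kappa}},
\qquad
\frac{\mathrm{d}}{\mathrm{d}b}\,\mathrm{AMISE}(b)
  = 2 C_1 b - \frac{\kappa\, C_2}{n\,b^{\kappa+1}} = 0
\;\Longrightarrow\;
b_{\mathrm{opt}} = \left(\frac{\kappa\, C_2}{2\, C_1\, n}\right)^{1/(\kappa+2)} ,
\]
\[
\mathrm{AMISE}(b_{\mathrm{opt}}) \asymp n^{-2/(\kappa+2)} .
\]
```

Whatever the true exponent is, the same two-line argument applies once the leading bias and variance constants are known.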
Where Pith is reading between the lines
- The consistent underperformance of the Gasser-Müller weights relative to local-linear weights on the simplex suggests that the weighting scheme itself, rather than the kernel family, is the dominant source of the difference.
- The real-data illustration on compositional predictors implies that any of the three estimators can be applied directly to problems in which the covariates are constrained to sum to one.
- Replacing the fixed design by a random design drawn from a Dirichlet distribution would require only minor changes to the bias and variance derivations already obtained.
Load-bearing premise
The fixed design points lie on the simplex and the Dirichlet kernel satisfies the moment and positivity conditions needed for the bias and variance expansions to hold.
What would settle it
A Monte Carlo experiment on a known twice-differentiable regression function on the simplex, with sample sizes growing to several thousand, would settle it: if the empirical distribution of the normalized estimator failed to approach a normal limit, the asymptotic normality claim would be falsified.
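A scaled-down version of such an experiment might look like the sketch below. Everything in it is an assumption made for illustration: the regression function, the target point, the skewed noise distribution, the Voronoi-cell weighting, and the use of sample skewness and excess kurtosis as a crude normality diagnostic. The names `gm_weights` and `normality_diagnostics` are hypothetical. The estimates are centered at their own empirical mean, so the sketch probes the shape of the distribution only, not the bias.

```python
import numpy as np

def gm_weights(s, design, b, n_draws, rng):
    """Kernel mass of each (assumed) Voronoi cell, approximated by Monte Carlo."""
    draws = rng.dirichlet(np.asarray(s, dtype=float) / b + 1.0, size=n_draws)
    d2 = ((draws ** 2).sum(axis=1)[:, None]
          - 2.0 * draws @ design.T
          + (design ** 2).sum(axis=1)[None, :])
    return np.bincount(d2.argmin(axis=1), minlength=len(design)) / n_draws

def normality_diagnostics(n=400, b=0.05, n_rep=2000, n_draws=5000, seed=0):
    """Return (skewness, excess kurtosis) of the standardized estimates."""
    rng = np.random.default_rng(seed)
    # Design points are drawn once and then held fixed across replications.
    design = rng.dirichlet([1.0, 1.0, 1.0], size=n)
    # Hypothetical smooth regression function on the 2-dimensional simplex.
    truth = np.sin(2.0 * np.pi * design[:, 0]) + design[:, 1] ** 2
    s = np.array([0.3, 0.3, 0.4])                 # interior target point
    w = gm_weights(s, design, b, n_draws, rng)
    # Centered exponential noise: skewed, so normality is not automatic.
    eps = rng.exponential(1.0, size=(n_rep, n)) - 1.0
    est = (truth + eps) @ w                       # one estimate per replication
    z = (est - est.mean()) / est.std()
    return float(np.mean(z ** 3)), float(np.mean(z ** 4) - 3.0)
```

Repeating the call for increasing `n`, with `b` shrinking accordingly, and watching whether both diagnostics drift toward zero gives a rough picture; a formal test would be needed before drawing conclusions.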
Original abstract
A Dirichlet-kernel Gasser-Müller (D-GM) estimator is introduced for fixed design regression on the simplex, extending the univariate analog due to Chen [Statist. Sinica, vol. 10(1) (2000), pp. 73-91]. Its pointwise bias and variance, asymptotic normality, and mean integrated squared error are investigated. Some simulation experiments are conducted to compare its small-sample performance with that of two recently proposed alternatives: the Dirichlet-kernel Nadaraya-Watson (D-NW) and local linear (D-LL) estimators. The simulation results reveal that the D-LL estimator is best among the D-LL, D-NW, and D-GM estimators and that the proposed D-GM estimator is worst. A real data analysis is also reported for the GEMAS dataset to analyze the relationship between soil composition and pH levels across various agricultural and grazing lands in Europe.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces a Dirichlet-kernel Gasser-Müller (D-GM) estimator for fixed-design nonparametric regression on the simplex, extending Chen (2000). It derives pointwise bias, variance, asymptotic normality, and MISE; conducts simulations comparing D-GM to D-NW and D-LL estimators (finding D-LL best and D-GM worst); and applies the methods to the GEMAS soil-composition dataset for pH modeling.
Significance. If the asymptotic expansions hold, the work provides a new fixed-design estimator for regression on compositional data, together with simulation evidence favoring the local-linear version over the Gasser-Müller and Nadaraya-Watson versions on the simplex. The inclusion of both theoretical derivations and a real-data illustration is a positive feature.
major comments (2)
- [Sections containing the bias/variance and MISE theorems] The bias, variance, normality, and MISE derivations extend Chen (2000) but rest on the Dirichlet kernel satisfying the standard moment conditions (integral equals 1, appropriate first-moment vanishing for bias order h, positivity, and tail decay) when integrated against the fixed-design measure on the simplex. The manuscript states the extension without re-deriving or explicitly verifying these integrals under the simplex geometry and boundary behavior; this verification is load-bearing for all four theoretical results.
- [Simulation section] The simulation design and results (D-GM worst) are consistent with the possibility that the kernel conditions fail to transfer directly, yet the paper offers no diagnostic (e.g., numerical check of the kernel moments on the simplex) that would confirm or refute the source of the performance gap.
minor comments (1)
- Notation for the simplex, the fixed-design points, and the precise definition of the Dirichlet kernel should be introduced with a dedicated preliminary subsection before the estimator is defined.
Simulated Author's Rebuttal
We thank the referee for the careful reading and for identifying these key points about the theoretical assumptions and the simulation diagnostics. We address each major comment below.
Point-by-point responses
- Referee: [Sections containing the bias/variance and MISE theorems] The bias, variance, normality, and MISE derivations extend Chen (2000) but rest on the Dirichlet kernel satisfying the standard moment conditions (integral equals 1, appropriate first-moment vanishing for bias order h, positivity, and tail decay) when integrated against the fixed-design measure on the simplex. The manuscript states the extension without re-deriving or explicitly verifying these integrals under the simplex geometry and boundary behavior; this verification is load-bearing for all four theoretical results.
Authors: We agree that the manuscript does not contain an explicit re-derivation or numerical/symbolic verification of the moment conditions for the Dirichlet kernel under the fixed-design measure on the simplex, including boundary effects. While the structure follows Chen (2000), this verification is indeed necessary to rigorously support the bias, variance, normality, and MISE results. In the revised manuscript we will add a new subsection (or appendix) that explicitly verifies the required integrals: the kernel integrates to 1, the first-moment condition holds at the appropriate order in h, positivity is preserved, and the tail decay is sufficient, all with respect to the simplex geometry and the fixed-design points.
Revision: yes
- Referee: [Simulation section] The simulation design and results (D-GM worst) are consistent with the possibility that the kernel conditions fail to transfer directly, yet the paper offers no diagnostic (e.g., numerical check of the kernel moments on the simplex) that would confirm or refute the source of the performance gap.
Authors: We concur that the observed ranking (D-LL best, D-GM worst) could be consistent with the moment conditions not transferring directly, and that the absence of a diagnostic leaves this possibility unexamined. We will add, in the revised simulation section, a numerical check that evaluates the relevant kernel moments (integral, first moment, etc.) when integrated against the empirical fixed-design measure on the simplex for the bandwidths and sample sizes used in the experiments. This diagnostic will be reported alongside the existing simulation results.
Revision: yes
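A numerical check of the kind discussed in this exchange could start from the sketch below. It is an assumption-laden illustration, not the authors' diagnostic: it takes the kernel at a target composition `s` to be the Dirichlet density with parameter `s / b + 1`, compares a Monte Carlo estimate of the kernel's first moment with the closed-form Dirichlet mean, and reports both as offsets from `s`. It checks the continuous kernel only, not the kernel integrated against an empirical fixed-design measure. The function name `kernel_moment_check` is hypothetical.

```python
import numpy as np

def kernel_moment_check(s, b, n_draws=200_000, seed=0):
    """First-moment offset of an assumed Dirichlet kernel centered at s.

    s : target point as a full composition (nonnegative, sums to 1)
    b : bandwidth (> 0)
    Returns (Monte Carlo mean - s, exact mean - s).
    """
    s = np.asarray(s, dtype=float)
    alpha = s / b + 1.0                      # assumed kernel parametrization
    rng = np.random.default_rng(seed)
    draws = rng.dirichlet(alpha, size=n_draws)
    mc_offset = draws.mean(axis=0) - s
    # The mean of a Dirichlet(alpha) vector is alpha / sum(alpha). With k
    # components this gives an offset b * (1 - k * s) / (1 + k * b), so the
    # first moment matches s only up to a term of order b.
    exact_offset = alpha / alpha.sum() - s
    return mc_offset, exact_offset
```

For example, calling `kernel_moment_check([0.2, 0.3, 0.5], b=0.1)` and then again with `b=0.05` shows the exact offset shrinking roughly in proportion to the bandwidth, which is consistent with a bias of order equal to the bandwidth.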
Circularity Check
No significant circularity; derivations are extensions of external prior work
Full rationale
The paper extends the univariate construction of Chen (2000) to the simplex and derives the bias, variance, asymptotic normality, and MISE of the D-GM estimator. No quoted steps reduce by construction to fitted inputs, self-definitions, or load-bearing self-citations. The Chen citation is external and independent. Concerns about moment conditions on the simplex are validity issues, not circularity. This matches the default expectation of a self-contained extension.
Reference graph
Works this paper leans on
- [1] R. J. Adler and J. E. Taylor. Random Fields and Geometry. Springer Monographs in Mathematics. Springer, New York, 2007. ISBN 978-0-387-48112-8. MR2319516 http://www.ams.org/mathscinet-getitem?mr=MR2319516
- [2]
- [3]
- [4]
- [5] T. Bouezmarni and J. V. K. Rombouts. Nonparametric density estimation for multivariate bounded data. J. Statist. Plann. Inference, 140(1):139-152, 2010. MR2568128 http://www.ams.org/mathscinet-getitem?mr=MR2568128
- [6] S. Bouzebda, A. Nezzal, and I. Elhattab. Limit theorems for nonparametric conditional U-statistics smoothed by asymmetric kernels. AIMS Math., 9(9):26195-26282, 2024. MR4796622 http://www.ams.org/mathscinet-getitem?mr=MR4796622
- [7] B. M. Brown and S. X. Chen. Beta-Bernstein smoothing for regression curves with compact support. Scand. J. Statist., 26(1):47-59, 1999. MR1685301 http://www.ams.org/mathscinet-getitem?mr=MR1685301
- [8] S. X. Chen. Beta kernel estimators for density functions. Comput. Statist. Data Anal., 31(2):131-145, 1999. MR1718494 http://www.ams.org/mathscinet-getitem?mr=MR1718494
- [9] S. X. Chen. Beta kernel smoothers for regression curves. Statist. Sinica, 10(1):73-91, 2000. MR1742101 http://www.ams.org/mathscinet-getitem?mr=MR1742101
- [10] S. X. Chen. Local linear smoothers using asymmetric kernels. Ann. Inst. Statist. Math., 54(2):312-323, 2002. MR1910175 http://www.ams.org/mathscinet-getitem?mr=MR1910175
- [11]
- [12] W. S. Cleveland. Robust locally weighted regression and smoothing scatterplots. J. Amer. Statist. Assoc., 74(368):829-836, 1979. MR556476 http://www.ams.org/mathscinet-getitem?mr=MR556476
- [13]
- [14] L. Devroye, L. Györfi, G. Lugosi, and H. Walk. On the measure of Voronoi cells. J. Appl. Probab., 54(2):394-408, 2017. MR3668473 http://www.ams.org/mathscinet-getitem?mr=MR3668473
- [15] J. Fan. Design-adaptive nonparametric regression. J. Amer. Statist. Assoc., 87(420):998-1004, 1992. MR1209561 http://www.ams.org/mathscinet-getitem?mr=MR1209561
- [16] J. Fan. Local linear regression smoothers and their minimax efficiencies. Ann. Statist., 21(1):196-216, 1993. MR1212173 http://www.ams.org/mathscinet-getitem?mr=MR1212173
- [17]
- [18]
- [19] B. Funke and M. Hirukawa. Bias correction for local linear regression estimation using asymmetric kernels via the skewing method. Econom. Stat., 20(C):109-130, 2021. MR4302589 http://www.ams.org/mathscinet-getitem?mr=MR4302589
- [20] B. Funke and M. Hirukawa. On uniform consistency of nonparametric estimators smoothed by the gamma kernel. Preprint, 29 pp., 2024.
- [21] T. Gasser and H.-G. Müller. Kernel estimation of regression functions. In Smoothing Techniques for Curve Estimation (Proc. Workshop, Heidelberg, 1979), volume 757 of Lecture Notes in Math., pages 23-68. Springer, Berlin, 1979. MR564251 http://www.ams.org/mathscinet-getitem?mr=MR564251
- [22] T. Gasser, H.-G. Müller, and V. Mammitzsch. Kernels for nonparametric curve estimation. J. Roy. Statist. Soc. Ser. B, 47(2):238-252, 1985. MR564251 http://www.ams.org/mathscinet-getitem?mr=MR564251
- [23] C. Genest and F. Ouimet. Local linear smoothing for regression surfaces on the simplex using Dirichlet kernels. Preprint, 20 pp., 2024. arXiv:2408.07209 https://arxiv.org/abs/2408.07209
- [24] I. Gibbs and L. Chen. Asymptotic properties of random Voronoi cells with arbitrary underlying density. Adv. in Appl. Probab., 52(2):655-680, 2020. MR4123649 http://www.ams.org/mathscinet-getitem?mr=MR4123649
- [25] M. Hirukawa, I. Murtazashvili, and A. Prokhorov. Uniform convergence rates for nonparametric estimators smoothed by the beta kernel. Scand. J. Stat., 49(3):1353-1382, 2022. ISSN 0303-6898,1467-9469. MR4471289 http://www.ams.org/mathscinet-getitem?mr=MR4471289
- [26] M. Hirukawa, I. Murtazashvili, and A. Prokhorov. Yet another look at the omitted variable bias. Econometric Rev., 42(1):1-27, 2023. ISSN 0747-4938,1532-4168. MR4556820 http://www.ams.org/mathscinet-getitem?mr=MR4556820
- [27] M. C. Jones. Simple boundary correction for kernel density estimation. Stat Comput., 3:135-146, 1993. doi:10.1007/BF00147776 https://www.doi.org/10.1007/BF00147776
- [28] V. Ya. Katkovnik. Linear and nonlinear methods of nonparametric regression analysis. Avtomatika, (5):35-46, 93, 1979. MR582402 http://www.ams.org/mathscinet-getitem?mr=MR582402
- [29] C. C. Kokonendji and S. M. Somé. On multivariate associated kernels to estimate general density functions. J. Korean Statist. Soc., 47(1):112-126, 2018. MR3760293 http://www.ams.org/mathscinet-getitem?mr=MR3760293
- [30] H.-G. Müller. Nonparametric Regression Analysis of Longitudinal Data, volume 46 of Lecture Notes in Statistics. Springer-Verlag, Berlin, 1988. ISBN 3-540-96844-X. MR960887 http://www.ams.org/mathscinet-getitem?mr=MR960887
- [31] H.-G. Müller. Smooth optimum kernel estimators near endpoints. Biometrika, 78(3):521-530, 1991. MR1130920 http://www.ams.org/mathscinet-getitem?mr=MR1130920
- [32] H.-G. Müller. Surface and function approximation with nonparametric regression. Rend. Sem. Mat. Fis. Milano, 63:171-211 (1995), 1993. MR1369600 http://www.ams.org/mathscinet-getitem?mr=MR1369600
- [33] H.-G. Müller and K. A. Prewitt. Multiparameter bandwidth processes and adaptive surface smoothing. J. Multivariate Anal., 47(1):1-21, 1993. MR1239102 http://www.ams.org/mathscinet-getitem?mr=MR1239102
- [34] È. A. Nadaraja. On a regression estimate. Teor. Verojatnost. i Primenen., 9:157-159, 1964. MR166874 http://www.ams.org/mathscinet-getitem?mr=MR166874
- [35] F. Ouimet. A symmetric matrix-variate normal local approximation for the Wishart distribution and some applications. J. Multivariate Anal., 189:Paper No. 104923, 17 pp., 2022. MR4358612 http://www.ams.org/mathscinet-getitem?mr=MR4358612
- [36] F. Ouimet and R. Tolosana-Delgado. Asymptotic properties of Dirichlet kernel density estimators. J. Multivariate Anal., 187:Paper No. 104832, 25 pp., 2022. MR4319409 http://www.ams.org/mathscinet-getitem?mr=MR4319409
- [37] M. B. Priestley and M. T. Chao. Non-parametric function fitting. J. Roy. Statist. Soc. Ser. B, 34, 1972. MR331616 http://www.ams.org/mathscinet-getitem?mr=MR331616
- [38] C. Reimann, P. Filzmoser, K. Fabian, K. Hron, M. Birke, A. Demetriades, E. Dinelli, A. Ladenberger, and The GEMAS Project Team. The concept of compositional data analysis in practice – Total major element concentrations in agricultural and grazing land soils of Europe. Sci. Total Environ., 426:196-210, 2012. doi:10.1016/j.scitotenv.2012.02.032 http...
- [39] D. Ruppert and M. P. Wand. Multivariate locally weighted least squares regression. Ann. Statist., 22(3):1346-1370, 1994. MR1311979 http://www.ams.org/mathscinet-getitem?mr=MR1311979
- [40]
- [41] B. W. Silverman. Density Estimation for Statistics and Data Analysis. Monographs on Statistics and Applied Probability. Chapman & Hall, London, 1986. ISBN 0-412-24620-1. MR848134 http://www.ams.org/mathscinet-getitem?mr=MR848134
- [42] S. M. Somé and C. C. Kokonendji. Effects of associated kernels in nonparametric multiple regressions. J. Stat. Theory Pract., 10(2):456-471, 2016. MR3499725 http://www.ams.org/mathscinet-getitem?mr=MR3499725
- [43] U. Stadtmüller. Asymptotic properties of nonparametric curve estimates. Period. Math. Hungar., 17(2):83-108, 1986. MR858109 http://www.ams.org/mathscinet-getitem?mr=MR858109
- [44] C. J. Stone. Consistent nonparametric regression. Ann. Statist., 5(4):595-645, 1977. MR443204 http://www.ams.org/mathscinet-getitem?mr=MR443204
- [45] C. J. Stone. Optimal rates of convergence for nonparametric estimators. Ann. Statist., 8(6):1348-1360, 1980. MR594650 http://www.ams.org/mathscinet-getitem?mr=MR594650
- [46] C. J. Stone. Optimal global rates of convergence for nonparametric regression. Ann. Statist., 10(4):1040-1053, 1982. MR673642 http://www.ams.org/mathscinet-getitem?mr=MR673642
- [47]
- [48]
- [49] G. S. Watson. Smooth regression analysis. Sankhyā Ser. A, 26:359-372, 1964. MR185765 http://www.ams.org/mathscinet-getitem?mr=MR185765
- [50] S. Zhang and R. J. Karunamuni. On kernel density estimation near endpoints. J. Statist. Plann. Inference, 70(2):301-316, 1998. MR1649872 http://www.ams.org/mathscinet-getitem?mr=MR1649872
- [51] S. Zhang and R. J. Karunamuni. On nonparametric density estimation at the boundary. J. Nonparametr. Statist., 12(2):197-221, 2000. MR1752313 http://www.ams.org/mathscinet-getitem?mr=MR1752313