Tuning free Catoni type joint robust estimation

Jun S. Liu; Lihu Xu; Qiang Sun; Xiang Li

arxiv: 2511.11054 · v2 · pith:64PTCJN7new · submitted 2025-11-14 · 🧮 math.ST · stat.TH

Pith reviewed 2026-05-21 20:07 UTC · model grok-4.3

classification 🧮 math.ST stat.TH

keywords heavy-tailed estimation · Catoni estimator · joint estimation · robust statistics · mean estimation · linear regression · non-asymptotic bounds · Poincaré-Miranda theorem

The pith

A system of two coupled Catoni-type equations estimates both a parameter and its unknown variance at sub-Gaussian rates under heavy tails, without tuning.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper constructs a joint estimation procedure that solves two linked Catoni-type equations to recover the target parameter and the noise variance simultaneously. It does so for mean estimation, linear regression, and ℓ2-penalized regression under the sole assumption that the noise has a finite moment of order 2β with β ∈ (1, 2]. The resulting non-asymptotic deviation bounds for both quantities match, up to absolute constants, the rates that would be available if the variance were known in advance. Because the equations are non-convex, the analysis replaces standard convex M-estimation tools with an application of the Poincaré-Miranda theorem, which guarantees the existence of suitable solutions and is used to control their joint error.

Core claim

The central claim is that the coupled system of two Catoni-type estimating equations admits solutions whose joint deviation from the true parameter and true variance satisfies sub-Gaussian-type bounds under a finite 2β-moment condition with β∈(1,2], with rates that match those of oracle procedures knowing the variance in advance.

What carries the argument

The pair of coupled, non-convex Catoni-type estimating equations for the parameter and the variance, whose joint solutions are controlled via the Poincaré-Miranda theorem.
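For orientation, the classical single-equation Catoni estimator of a mean (Catoni, 2012), which treats the variance as known, can be sketched as follows. This is not the paper's coupled system, whose exact equations are not reproduced on this page. The function names `catoni_psi` and `catoni_mean`, the choice of the widest admissible influence function, and the scale `alpha` are assumptions made for illustration.

```python
import math


def catoni_psi(x: float) -> float:
    """Widest influence function allowed by Catoni's bounds:
    log(1 + x + x^2/2) for x >= 0 and -log(1 - x + x^2/2) for x < 0."""
    if x >= 0:
        return math.log1p(x + 0.5 * x * x)
    return -math.log1p(-x + 0.5 * x * x)


def catoni_mean(xs, sigma, delta=0.05, iters=200):
    """Solve sum_i psi(alpha * (x_i - theta)) = 0 for theta by bisection.

    sigma is treated as KNOWN here (the oracle setting). The scale
    alpha = sqrt(2 log(1/delta) / (n sigma^2)) is an illustrative choice.
    """
    n = len(xs)
    alpha = math.sqrt(2.0 * math.log(1.0 / delta) / (n * sigma * sigma))

    def score(theta):
        return sum(catoni_psi(alpha * (x - theta)) for x in xs)

    # psi is increasing, so score is decreasing in theta and changes sign
    # between min(xs) and max(xs).
    lo, hi = min(xs), max(xs)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# One gross outlier among 100 points: the sample mean is 10.0,
# while the Catoni estimate stays close to the bulk of the data.
estimate = catoni_mean([0.0] * 99 + [1000.0], sigma=1.0)
```

In this known-variance version, `sigma` enters only through `alpha`. The paper's contribution, as described above, is to remove that input by adding a second equation for the variance.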

If this is right

The same joint rates hold in mean estimation, linear regression, and ℓ2-penalized regression.
The bounds remain valid without knowledge of the variance or any tuning parameters.
The rates are optimal up to absolute constants in the heavy-tailed regime.
The proof strategy applies to other problems that require simultaneous estimation of parameters of different types.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same topological control might simplify proofs for joint robust estimation in generalized linear models.
Practitioners facing data with unknown scale could replace separate variance estimation and cross-validation steps with this single procedure.
Because the moment condition 2β allows β arbitrarily close to 1, the method should remain useful even when the noise has barely more than a finite variance, that is, in the very heavy-tailed regime.

Load-bearing premise

The non-convex coupled equations must possess solutions whose joint deviations can be bounded using a topological theorem instead of convexity arguments.
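For reference, the standard two-dimensional form of the Poincaré-Miranda theorem can be stated as below. The symbols $F_1$, $F_2$, $a_1$, $a_2$ and the centering of the rectangle at the origin are notation chosen here, not taken from the paper.

```latex
\textbf{Theorem (Poincar\'e--Miranda, $n = 2$).}
Let $F = (F_1, F_2) : [-a_1, a_1] \times [-a_2, a_2] \to \mathbb{R}^2$ be continuous, and suppose
\begin{align*}
F_1(-a_1, y) \le 0 \le F_1(a_1, y) \quad &\text{for all } y \in [-a_2, a_2], \\
F_2(x, -a_2) \le 0 \le F_2(x, a_2) \quad &\text{for all } x \in [-a_1, a_1].
\end{align*}
Then there exists $(x^\ast, y^\ast) \in [-a_1, a_1] \times [-a_2, a_2]$ with $F(x^\ast, y^\ast) = (0, 0)$.
```

The theorem asserts existence inside the rectangle only. It says nothing about uniqueness or about zeros elsewhere.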

What would settle it

Generate data whose noise has finite moments of order 2.1 but not much higher (that is, β = 1.05), and check whether the observed joint deviation of the estimator from the true parameter and variance exceeds the claimed sub-Gaussian bound by more than a small constant factor.
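Such a check could be organized along the following lines. This is a minimal sketch with several assumptions made for illustration: symmetrized Pareto noise with tail index 2.2 (so moments of order 2.1 are finite but moments of order 2.2 are not), the benchmark radius σ√(2 log(2/δ)/n), and the names `heavy_tailed_sample` and `deviation_ratio`. The sample mean is only a placeholder for the paper's joint estimator, which is not reproduced here.

```python
import math
import random
import statistics


def heavy_tailed_sample(n, tail_index, rng):
    """Symmetrized Pareto draws: mean 0, finite moments only of order < tail_index."""
    return [rng.choice((-1.0, 1.0)) * rng.paretovariate(tail_index) for _ in range(n)]


def deviation_ratio(estimator, n=200, tail_index=2.2, delta=0.05, reps=500, seed=0):
    """Empirical (1 - delta)-quantile of |estimate - true mean| (true mean is 0),
    divided by the sub-Gaussian benchmark sigma * sqrt(2 * log(2 / delta) / n)."""
    rng = random.Random(seed)
    # Variance of the symmetrized Pareto with unit scale is a / (a - 2) for a > 2.
    sigma = math.sqrt(tail_index / (tail_index - 2.0))
    benchmark = sigma * math.sqrt(2.0 * math.log(2.0 / delta) / n)
    errors = sorted(
        abs(estimator(heavy_tailed_sample(n, tail_index, rng))) for _ in range(reps)
    )
    index = min(reps - 1, math.ceil((1.0 - delta) * reps) - 1)
    return errors[index] / benchmark


# Placeholder estimator: the sample mean. Swap in the estimator under test.
ratio = deviation_ratio(statistics.fmean)
```

A ratio that stays bounded by a small constant as `n` and `delta` vary would be consistent with a sub-Gaussian-type bound; a ratio that grows as `delta` shrinks would not. The same harness would need a second output for the variance estimate to test the joint claim.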

Original abstract

This paper develops a Catoni-type joint (tuning-free) estimation framework for parametric models with heavy-tailed noise, in which the target parameter and the unknown noise variance are estimated simultaneously through a system of two coupled Catoni-type estimating equations. We instantiate the framework in three canonical settings: mean estimation, linear regression, and $\ell_{2}$-penalized regression. Theoretically, we establish non-asymptotic, sub-Gaussian-type deviation bounds that hold jointly for the target parameter and the variance estimator, under only a finite $2\beta$-th moment assumption with $\beta\in (1,2]$. The resulting rates match -- up to absolute constants -- those of oracle procedures that know the variance in advance, thereby attaining optimality in the heavy-tailed regime. Methodologically, because the coupled equations are intrinsically non-convex and non-linear, classical convex M-estimation arguments are inapplicable. We develop a new analytical toolkit based on the Poincare--Miranda theorem. The resulting proof strategy is of independent methodological interest, and we expect it to be applicable to a broad class of other statistical problems in which several parameters of heterogeneous nature must be estimated jointly.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper gives a tuning-free joint Catoni estimator for parameter and variance in heavy-tailed models via coupled equations and uses Poincaré-Miranda to get oracle-matching rates, but the analysis only guarantees existence inside the target region.

The paper's main contribution is a joint estimation procedure for a target parameter and the noise variance, based on two coupled Catoni-type equations that require no tuning. The authors show that the resulting estimators achieve oracle rates under only a finite 2β-th moment with β ∈ (1,2], and they use the Poincaré-Miranda theorem to establish existence of solutions in the right region for the non-convex system in the mean estimation, linear regression, and penalized regression settings.

Referee Report

2 major / 1 minor

Summary. The paper develops a tuning-free joint robust estimation framework for parametric models with heavy-tailed noise, simultaneously estimating the target parameter and unknown noise variance via a system of two coupled Catoni-type estimating equations. It instantiates the approach for mean estimation, linear regression, and ℓ₂-penalized regression. The central theoretical claim is the derivation of non-asymptotic sub-Gaussian-type joint deviation bounds under only a finite 2β-th moment assumption (β ∈ (1,2]), with rates matching those of oracle procedures that know the variance in advance. The proofs rely on the Poincaré-Miranda theorem to establish existence of solutions to the non-convex system, bypassing classical convex M-estimation arguments.

Significance. If the central claims hold, the work would provide a valuable contribution to robust statistics by delivering optimal, variance-adaptive estimators under weak moment conditions without tuning parameters. The methodological innovation of adapting Poincaré-Miranda for joint non-convex estimation could extend to other problems involving heterogeneous parameters, and the oracle-matching rates under 2β moments represent a strong theoretical achievement.

major comments (2)

[Proofs of the main deviation bounds (theorems establishing joint sub-Gaussian rates)] The proof strategy invokes Poincaré-Miranda to guarantee a zero of the coupled estimating functions inside a rectangle whose dimensions are set to the target deviation rates. However, the theorem only ensures existence within the rectangle once opposing sign conditions hold on the faces; it provides no control over possible additional zeros outside the rectangle. Under the stated polynomial integrability of the fluctuation terms, far-field behavior is not automatically dominated, so the non-asymptotic bounds may apply only to some solutions rather than to every solution of the system. This affects the well-definedness of the estimator and the validity of the joint deviation claim.
[Section 2 (definition of the joint estimators) and the subsequent theoretical analysis] The estimators are defined as solutions to the coupled non-convex, non-linear equations. Without an additional argument showing that no solutions exist outside the target deviation ball (e.g., via uniform domination of the expectation term by the fluctuation term for large deviations), or a constructive selection rule for the solution inside the rectangle, it remains unclear which root is being bounded and how the procedure is implemented in practice.
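The gap described in these comments can be seen on a toy example. The sketch below checks the Poincaré-Miranda sign conditions numerically on the faces of a rectangle for a hypothetical map `toy_map` (not the paper's estimating equations) that has one zero inside the rectangle and another outside it. The grid check and all names are assumptions made for illustration, and a finite grid is evidence rather than a proof.

```python
def toy_map(x, y):
    """Hypothetical 2-D map with zeros at (0, 0) and (3, 0)."""
    return x * (1.0 - x / 3.0), y


def sign_conditions_hold(f, a1, a2, grid=201):
    """Check, at grid points only, that F1 <= 0 on the face x = -a1 and
    F1 >= 0 on x = a1, and that F2 <= 0 on y = -a2 and F2 >= 0 on y = a2."""
    for k in range(grid):
        t = -1.0 + 2.0 * k / (grid - 1)
        x, y = t * a1, t * a2
        if f(-a1, y)[0] > 0 or f(a1, y)[0] < 0:
            return False
        if f(x, -a2)[1] > 0 or f(x, a2)[1] < 0:
            return False
    return True


inside_ok = sign_conditions_hold(toy_map, 1.0, 1.0)  # conditions on [-1, 1]^2
zero_inside = toy_map(0.0, 0.0) == (0.0, 0.0)        # zero inside the rectangle
zero_outside = toy_map(3.0, 0.0) == (0.0, 0.0)       # zero outside the rectangle
```

Here the sign conditions hold on the rectangle and a zero exists inside it, as the theorem guarantees, yet the system has a second zero that the theorem never sees. A deviation bound proved this way covers the inside root only.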

minor comments (1)

[Section 2] Notation for the Catoni function and the coupled equations could be made more explicit when first introduced to aid readability for readers unfamiliar with the original Catoni estimator.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thorough review and valuable comments on our manuscript. We appreciate the recognition of the potential contributions and address the major comments point by point below. We plan to make revisions to clarify the well-definedness of the estimators.

Referee: The proof strategy invokes Poincaré-Miranda to guarantee a zero of the coupled estimating functions inside a rectangle whose dimensions are set to the target deviation rates. However, the theorem only ensures existence within the rectangle once opposing sign conditions hold on the faces; it provides no control over possible additional zeros outside the rectangle. Under the stated polynomial integrability of the fluctuation terms, far-field behavior is not automatically dominated, so the non-asymptotic bounds may apply only to some solutions rather than to every solution of the system. This affects the well-definedness of the estimator and the validity of the joint deviation claim.

Authors: We thank the referee for highlighting this important subtlety. The Poincaré-Miranda theorem is invoked solely to guarantee existence of at least one solution inside the target rectangle. We agree that this does not automatically rule out other zeros outside the rectangle. In the revision we will explicitly define the joint estimator as any solution lying inside the rectangle whose existence is assured by the theorem. The deviation bounds are then stated for this defined estimator. A short remark will be added noting that the selection is by construction inside the region of interest. This resolves the well-definedness issue while remaining faithful to the minimal moment assumptions.
Revision: yes

Referee: The estimators are defined as solutions to the coupled non-convex, non-linear equations. Without an additional argument showing that no solutions exist outside the target deviation ball (e.g., via uniform domination of the expectation term by the fluctuation term for large deviations), or a constructive selection rule for the solution inside the rectangle, it remains unclear which root is being bounded and how the procedure is implemented in practice.

Authors: We agree that the current wording leaves ambiguity about which root is intended. We will revise Section 2 to state that the estimator is defined to be a solution of the coupled system that lies inside the rectangle for which existence is guaranteed by Poincaré-Miranda. For implementation we will add a brief discussion indicating that the low-dimensional (two-equation) system can be solved numerically by standard methods such as Newton iteration or a merit-function minimization, initialized at a point scaled to the target deviation rates. This makes both the theoretical object and the practical procedure unambiguous.
Revision: yes
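A Newton iteration for a two-equation system, as mentioned in the response, might look like the sketch below. The toy system `toy_system`, the finite-difference Jacobian, and the names `newton_2d` and `start` are assumptions made for illustration. The paper's estimating equations and the authors' initialization are not reproduced here.

```python
def newton_2d(f, start, tol=1e-12, max_iter=50, h=1e-6):
    """Newton iteration for f(u, v) = (0, 0) with a central-difference Jacobian."""
    u, v = start
    for _ in range(max_iter):
        f1, f2 = f(u, v)
        if abs(f1) + abs(f2) < tol:
            break
        # Jacobian entries [[a, b], [c, d]] by central differences.
        a = (f(u + h, v)[0] - f(u - h, v)[0]) / (2 * h)
        b = (f(u, v + h)[0] - f(u, v - h)[0]) / (2 * h)
        c = (f(u + h, v)[1] - f(u - h, v)[1]) / (2 * h)
        d = (f(u, v + h)[1] - f(u, v - h)[1]) / (2 * h)
        det = a * d - b * c
        if det == 0.0:
            raise ZeroDivisionError("singular Jacobian")
        # Solve J * step = -f by Cramer's rule and update.
        u += (-f1 * d + f2 * b) / det
        v += (-f2 * a + f1 * c) / det
    return u, v


def toy_system(u, v):
    """Hypothetical system with roots at (sqrt(2), sqrt(2)) and (-sqrt(2), -sqrt(2))."""
    return u * u + v * v - 4.0, u - v


root = newton_2d(toy_system, start=(1.0, 2.0))
```

Which root such an iteration reaches depends on the starting point, so the initialization is part of the root-selection question the referee raises.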

Circularity Check

0 steps flagged

Derivation chain is self-contained; no reductions to inputs by construction

The estimators are defined directly as solutions to the coupled Catoni-type equations. Non-asymptotic joint deviation bounds are then derived relative to oracle procedures that know the variance, under explicit 2β-moment assumptions with β∈(1,2]. The proof invokes the Poincaré-Miranda theorem to establish existence inside a target rectangle whose dimensions are set to the claimed rates; this is a standard existence argument applied to the estimating functions and does not presuppose the target bounds or rename fitted quantities as predictions. No self-citations are load-bearing for the central claims, no ansatz is smuggled, and no known empirical pattern is merely relabeled. The derivation therefore remains independent of its own outputs.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The framework rests on a standard moment condition and an analytic theorem; no free parameters are introduced because the method is tuning-free.

axioms (2)

domain assumption: Finite 2β-th moment assumption for β ∈ (1,2].
Invoked to obtain the sub-Gaussian-type deviation bounds for the joint estimators.
domain assumption: The Poincaré-Miranda theorem applies to the coupled estimating equations.
Used to establish existence and to control the non-convex, non-linear system.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · echoes
echoes: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.
Paper passage: "We employ the Poincaré–Miranda Theorem to show that the solutions lie within certain geometric regions, such as cylinders or cones, centered around the true parameter values."

Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: "ψ1 satisfying -log(1-x+|x|²/2) ≤ ψ1(x) ≤ log(1+x+|x|²/2)"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Author affiliations

Department of Statistics and Data Science, Southern University of Science and Technology. Email address: lixiang3@sustech.edu.cn
Department of Statistics and Data Science, Tsinghua University. Email address: junsliu@tsinghua.edu.cn
Department of Statistical Sciences, University of Toronto. Email address: qiang.sun@utoronto.ca
Department of Mathematics, University of Macau ...
