Max-Affine Regression: Provable, Tractable, and Near-Optimal Statistical Estimation
Pith reviewed 2026-05-25 18:24 UTC · model grok-4.3
The pith
Alternating minimization with a spectral initializer recovers the coefficients of a max-affine regression model at near-parametric rates.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
When the design is random and sub-Gaussian and k is fixed, the alternating minimization algorithm, started from a spectral method combined with random search in a low-dimensional space, converges with high probability at a geometric rate to a small neighborhood of the optimal coefficients and attains a near-parametric, minimax-optimal estimation rate up to poly-log factors.
What carries the argument
Alternating minimization on the non-convex least-squares objective, initialized by a spectral method plus random search over a low-dimensional subspace.
If this is right
- The same procedure yields new guarantees for classical phase retrieval under weaker assumptions on the measurement distribution than previously required.
- The approach supplies an explicit, implementable way to regularize convex-regression-type problems and thereby escape the curse of dimensionality.
- Estimation remains tractable and statistically efficient even when the unknown function is piecewise affine with a constant number of pieces.
- The analysis directly produces high-probability bounds that depend on dimension, sample size, and noise variance in the expected minimax-optimal way.
Where Pith is reading between the lines
- The initializer construction may extend to other non-convex regression problems that admit a low-dimensional spectral signal.
- The geometric convergence result suggests that similar alternating schemes could be analyzed for piecewise-linear models with slowly growing k.
- The near-optimal rates indicate that max-affine regression can serve as a computationally friendly surrogate for general convex regression in moderate dimensions.
Load-bearing premise
The covariates follow a random sub-Gaussian design and the number of affine pieces k is treated as a fixed constant; the geometric convergence further requires the initializer to land inside the basin of attraction of the global optimum.
What would settle it
A dataset drawn from the stated random design in which the alternating-minimization iterates either fail to contract geometrically or produce estimation error that exceeds the claimed near-parametric bound by more than a poly-log factor in the dimension.
Figures
read the original abstract
Max-affine regression refers to a model where the unknown regression function is modeled as a maximum of $k$ unknown affine functions for a fixed $k \geq 1$. This generalizes linear regression and (real) phase retrieval, and is closely related to convex regression. Working within a non-asymptotic framework, we study this problem in the high-dimensional setting assuming that $k$ is a fixed constant, and focus on estimation of the unknown coefficients of the affine functions underlying the model. We analyze a natural alternating minimization (AM) algorithm for the non-convex least squares objective when the design is random. We show that the AM algorithm, when initialized suitably, converges with high probability and at a geometric rate to a small ball around the optimal coefficients. In order to initialize the algorithm, we propose and analyze a combination of a spectral method and a random search scheme in a low-dimensional space, which may be of independent interest. The final rate that we obtain is near-parametric and minimax optimal (up to a poly-logarithmic factor) as a function of the dimension, sample size, and noise variance. In that sense, our approach should be viewed as a direct and implementable method of enforcing regularization to alleviate the curse of dimensionality in problems of the convex regression type. As a by-product of our analysis, we also obtain guarantees on a classical algorithm for the phase retrieval problem under considerably weaker assumptions on the design distribution than was previously known. Numerical experiments illustrate the sharpness of our bounds in the various problem parameters.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper studies max-affine regression (regression function is the pointwise maximum of k fixed affine functions) in the high-dimensional regime under random design. It analyzes a natural alternating minimization (AM) algorithm on the non-convex least-squares objective and shows that, when initialized by a combination of a spectral method and random search in a low-dimensional space, the iterates converge geometrically with high probability to a small ball around the global optimum. The resulting estimator achieves near-parametric, minimax-optimal rates (up to poly-log factors) in dimension, sample size, and noise variance. A byproduct is improved guarantees for phase retrieval under weaker design assumptions.
Significance. If the claims hold, the work supplies a direct, implementable algorithm with non-asymptotic guarantees for a natural non-convex generalization of linear regression that includes real phase retrieval and is closely related to convex regression. The near-optimal rates for fixed k demonstrate that the method enforces regularization sufficient to escape the curse of dimensionality. The phase-retrieval corollary under weaker assumptions is a concrete additional contribution.
major comments (2)
- [§4 and Theorem 1] §4 (initializer analysis) and the statement of Theorem 1: the claimed high-probability success of the spectral-plus-random-search initializer rests on a covering or union-bound argument whose failure probability is controlled only up to poly(k) factors. Because the random-search component operates in a space whose effective dimension scales with the ambient dimension d of each affine piece, the overall success probability can fall below 1-o(1) as d grows even for fixed k. This directly undermines the geometric-convergence guarantee for the subsequent AM iterates, which is load-bearing for the central claim.
- [Theorem 2] Theorem 2 (final statistical rate): the near-minimax optimality is stated to hold up to a poly-logarithmic factor, but the precise dependence of the poly-log term on k and the failure probability of the initializer is not made explicit; without this, it is impossible to verify that the rate remains near-optimal once the initializer success probability is properly bounded.
minor comments (2)
- [Eq. (1)] Notation for the max-affine model (Eq. (1) or (2)) should explicitly state whether the k affine pieces share a common intercept or are allowed distinct intercepts; the current presentation leaves this ambiguous for readers.
- [Numerical experiments] The numerical experiments section would benefit from a plot or table that isolates the effect of the random-search radius on initialization success probability as d increases.
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive comments on the initializer analysis and statistical rates. We address the two major comments point by point below and will revise the manuscript to incorporate explicit probability bounds and dependence on k.
read point-by-point responses
-
Referee: [§4 and Theorem 1] §4 (initializer analysis) and the statement of Theorem 1: the claimed high-probability success of the spectral-plus-random-search initializer rests on a covering or union-bound argument whose failure probability is controlled only up to poly(k) factors. Because the random-search component operates in a space whose effective dimension scales with the ambient dimension d of each affine piece, the overall success probability can fall below 1-o(1) as d grows even for fixed k. This directly undermines the geometric-convergence guarantee for the subsequent AM iterates, which is load-bearing for the central claim.
Authors: We thank the referee for highlighting the need for explicit control on the failure probability. The random-search component is performed in a low-dimensional space whose dimension depends only on k (specifically, a net over a k-dimensional parameter space after the spectral step reduces the problem), not on the ambient d. The union bound therefore incurs only a poly(k) factor, and the per-point success probability of the random search yields an overall initializer success probability of 1 - O(poly(k) exp(-c d)) for a positive constant c independent of k and d. For any fixed k this tends to 1 as d grows, preserving the high-probability geometric convergence of the subsequent AM iterates. We will revise §4 and Theorem 1 to state this bound explicitly. revision: yes
-
Referee: [Theorem 2] Theorem 2 (final statistical rate): the near-minimax optimality is stated to hold up to a poly-logarithmic factor, but the precise dependence of the poly-log term on k and the failure probability of the initializer is not made explicit; without this, it is impossible to verify that the rate remains near-optimal once the initializer success probability is properly bounded.
Authors: We agree that the dependence should be stated explicitly. The poly-logarithmic factor in Theorem 2 arises from the covering numbers and concentration inequalities and is of the form (log(d n / δ))^{O(1)}, where δ is the initializer failure probability. Because δ = O(poly(k) exp(-c d)) with k fixed, the extra factors remain polylogarithmic in d and n (with polynomial dependence on k only). The resulting rate is therefore still near-minimax optimal up to these explicit factors. We will revise Theorem 2 and the surrounding text to display the precise dependence on k and δ. revision: yes
Circularity Check
No circularity: direct non-asymptotic analysis of explicit algorithm under random-design assumptions
full rationale
The paper states a direct analysis of the alternating minimization algorithm (initialized via spectral method plus random search in low-dimensional space) under random sub-Gaussian design with fixed k. Convergence at geometric rate to a small ball around the optimum, and the final near-parametric minimax rate (up to poly-log factors), are derived from the stated assumptions without any reduction of the claimed quantities to fitted parameters, self-defined objects, or load-bearing self-citations. The initializer success is part of the analysis rather than presupposed by the target result. No equations or steps reduce the prediction to the input by construction.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption Design matrix entries are drawn from a random distribution (sub-Gaussian or comparable).
- domain assumption k is a fixed constant independent of dimension and sample size.
Forward citations
Cited by 2 Pith papers
-
Locally Near Optimal Piecewise Linear Regression in High Dimensions via Difference of Max-Affine Functions
ABGD parametrizes piecewise linear functions as difference of max-affine functions and converges linearly to an epsilon-accurate solution with O(d max(sigma/epsilon,1)^2) samples under sub-Gaussian noise, which is min...
-
Expectation Maximization (EM) Converges for General Agnostic Mixtures
Gradient EM converges exponentially to optimal population loss minimizers for agnostic fitting of k parametric functions under strong convexity and smoothness of the loss, proper initialization, and separation conditions.
Reference graph
Works this paper leans on
-
[1]
Convex regression: theory, practice, and applications
G \'a bor Bal \'a zs. Convex regression: theory, practice, and applications . PhD thesis, University of Alberta, 2016
work page 2016
-
[2]
Slice inverse regression with score functions
Dmitry Babichev and Francis Bach. Slice inverse regression with score functions. Electronic Journal of Statistics , 12(1):1507--1543, 2018
work page 2018
-
[3]
Missing values in multivariate analysis
Evelyn ML Beale and Roderick JA Little. Missing values in multivariate analysis. Journal of the Royal Statistical Society: Series B (Methodological) , 37(1):129--145, 1975
work page 1975
-
[4]
-entropy of convex sets and functions
EM Bronshtein. -entropy of convex sets and functions. Siberian Mathematical Journal , 17(3):393--398, 1976
work page 1976
-
[5]
The truncated normal distribution
John Burkardt. The truncated normal distribution. 2014
work page 2014
-
[6]
Statistical guarantees for the EM algorithm: From population to sample-based analysis
Sivaraman Balakrishnan, Martin J Wainwright, and Bin Yu. Statistical guarantees for the EM algorithm: From population to sample-based analysis. The Annals of Statistics , 45(1):77--120, 2017
work page 2017
-
[7]
Solving random quadratic systems of equations is nearly as easy as solving linear systems
Yuxin Chen and Emmanuel Candes. Solving random quadratic systems of equations is nearly as easy as solving linear systems. In Advances in Neural Information Processing Systems , pages 739--747, 2015
work page 2015
-
[8]
On EM algorithms and their proximal generalizations
St \'e phane Chr \'e tien and Alfred O Hero. On EM algorithms and their proximal generalizations. ESAIM: Probability and Statistics , 12:308--326, 2008
work page 2008
-
[9]
Optimal Bounds on the VC-dimension
Monika Csikos, Andrey Kupavskii, and Nabil H Mustafa. Optimal bounds on the VC -dimension. arXiv preprint arXiv:1807.07924 , 2018
work page internal anchor Pith review Pith/arXiv arXiv 2018
-
[10]
Spectral experts for estimating mixtures of linear regressions
Arun Tejasvi Chaganty and Percy Liang. Spectral experts for estimating mixtures of linear regressions. In International Conference on Machine Learning , pages 1040--1048, 2013
work page 2013
-
[11]
Optimal rates of convergence for noisy sparse phase retrieval via thresholded W irtinger flow
T Tony Cai, Xiaodong Li, and Zongming Ma. Optimal rates of convergence for noisy sparse phase retrieval via thresholded W irtinger flow. The Annals of Statistics , 44(5):2221--2251, 2016
work page 2016
-
[12]
Phase retrieval via wirtinger flow: Theory and algorithms
Emmanuel J Candes, Xiaodong Li, and Mahdi Soltanolkotabi. Phase retrieval via wirtinger flow: Theory and algorithms. IEEE Transactions on Information Theory , 61(4):1985--2007, 2015
work page 1985
-
[13]
Array imaging using intensity-only measurements
Anwei Chai, Miguel Moscoso, and George Papanicolaou. Array imaging using intensity-only measurements. Inverse Problems , 27(1):015005, 2010
work page 2010
-
[14]
Distributional and l^q norm inequalities for polynomials over convex bodies in R ^n
Anthony Carbery and James Wright. Distributional and l^q norm inequalities for polynomials over convex bodies in R ^n . Mathematical research letters , 8(3):233--248, 2001
work page 2001
-
[15]
Solving (most) of a set of quadratic equalities: Composite optimization for robust phase retrieval
John C Duchi and Feng Ruan. Solving (most) of a set of quadratic equalities: Composite optimization for robust phase retrieval. arXiv preprint arXiv:1705.02356 , 2017
work page internal anchor Pith review Pith/arXiv arXiv 2017
-
[16]
Multiclass learning approaches: A theoretical comparison with implications
Amit Daniely, Sivan Sabato, and Shai S Shwartz. Multiclass learning approaches: A theoretical comparison with implications. In Advances in Neural Information Processing Systems , pages 485--493, 2012
work page 2012
-
[17]
Ten steps of EM suffice for mixtures of two G aussians
Constantinos Daskalakis, Christos Tzamos, and Manolis Zampetakis. Ten steps of EM suffice for mixtures of two G aussians. In 30th Annual Conference on Learning Theory , 2017
work page 2017
-
[18]
Phase retrieval: Stability and recovery guarantees
Yonina C Eldar and Shahar Mendelson. Phase retrieval: Stability and recovery guarantees. Applied and Computational Harmonic Analysis , 36(3):473--494, 2014
work page 2014
-
[19]
Phase retrieval and image reconstruction for astronomy
C Fienup and J Dainty. Phase retrieval and image reconstruction for astronomy. Image Recovery: Theory and Application , 231:275, 1987
work page 1987
-
[20]
Phase retrieval algorithms: a comparison
James R Fienup. Phase retrieval algorithms: a comparison. Applied optics , 21(15):2758--2769, 1982
work page 1982
-
[21]
Phase retrieval from very few measurements
Matthew Fickus, Dustin G Mixon, Aaron A Nelson, and Yang Wang. Phase retrieval from very few measurements. Linear Algebra and its Applications , 449:475--499, 2014
work page 2014
-
[22]
Phase retrieval for imaging problems
Fajwel Fogel, Ir \`e ne Waldspurger, and Alexandre d'Aspremont. Phase retrieval for imaging problems. Mathematical programming computation , 8(3):311--335, 2016
work page 2016
-
[23]
Richard J. Gardner. Geometric Tomography . Encyclopedia of Mathematics and its Applications. Cambridge University Press, 2 edition, 2006
work page 2006
-
[24]
Jens Gregor and Fernando R Rannou. Three-dimensional support function estimation and application for projection magnetic resonance imaging. International journal of imaging systems and technology , 12(1):43--50, 2002
work page 2002
-
[25]
A practical algorithm for the determination of phase from image and diffraction plane pictures
R Gerchberg and W Saxton. A practical algorithm for the determination of phase from image and diffraction plane pictures. Optik , 35:237--246, 1972
work page 1972
-
[26]
Covering numbers for convex functions
Adityanand Guntuboyina and Bodhisattva Sen. Covering numbers for convex functions. IEEE Transactions on Information Theory , 59(4):1957--1965, 2013
work page 1957
-
[27]
Optimal rates of convergence for convex set estimation from support functions
Adityanand Guntuboyina. Optimal rates of convergence for convex set estimation from support functions. The Annals of Statistics , 40(1):385--411, 2012
work page 2012
-
[28]
Entropy of convex functions on R ^d
Fuchang Gao and Jon A Wellner. Entropy of convex functions on R ^d . Constructive approximation , 46(3):565--592, 2017
work page 2017
-
[29]
Maximum likelihood estimation from incomplete data
Herman O Hartley. Maximum likelihood estimation from incomplete data. Biometrics , 14(2):174--194, 1958
work page 1958
-
[30]
Phase problem in crystallography
Robert W Harrison. Phase problem in crystallography. JOSA a , 10(5):1046--1055, 1993
work page 1993
-
[31]
Gilles Harg \'e . A convex/log-concave correlation inequality for G aussian measure and an application to abstract wiener spaces. Probability theory and related fields , 130(3):415--440, 2004
work page 2004
-
[32]
Multivariate convex regression with adaptive partitioning
Lauren A Hannah and David B Dunson. Multivariate convex regression with adaptive partitioning. The Journal of Machine Learning Research , 14(1):3261--3294, 2013
work page 2013
-
[33]
Learning mixtures of spherical G aussians: moment methods and spectral decompositions
Daniel Hsu and Sham M Kakade. Learning mixtures of spherical G aussians: moment methods and spectral decompositions. In Proceedings of the 4th conference on Innovations in Theoretical Computer Science , pages 11--20. ACM, 2013
work page 2013
-
[34]
Semiparametric and nonparametric methods in econometrics , volume 12
Joel L Horowitz. Semiparametric and nonparametric methods in econometrics , volume 12. Springer, 2009
work page 2009
-
[35]
It \^o -wiener chaos expansion with exact residual and correlation, variance inequalities
Yaozhong Hu. It \^o -wiener chaos expansion with exact residual and correlation, variance inequalities. Journal of Theoretical Probability , 10(4):835--848, 1997
work page 1997
-
[36]
Multivariate convex regression: global risk bounds and adaptation
Qiyang Han and Jon A Wellner. Multivariate convex regression: global risk bounds and adaptation. arXiv preprint arXiv:1601.06844 , 2016
work page internal anchor Pith review Pith/arXiv arXiv 2016
-
[37]
Hierarchical mixtures of experts and the EM algorithm
Michael I Jordan and Robert A Jacobs. Hierarchical mixtures of experts and the EM algorithm. Neural computation , 6(2):181--214, 1994
work page 1994
-
[38]
Adaptive mixtures of local experts
Robert A Jacobs, Michael I Jordan, Steven J Nowlan, and Geoffrey E Hinton. Adaptive mixtures of local experts. Neural computation , 3(1):79--87, 1991
work page 1991
-
[39]
Limit distribution of the sum and maximum from multivariate G aussian sequences
Barry James, Kang James, and Yongcheng Qi. Limit distribution of the sum and maximum from multivariate G aussian sequences. Journal of multivariate analysis , 98(3):517--532, 2007
work page 2007
-
[40]
Chi Jin, Yuchen Zhang, Sivaraman Balakrishnan, Martin J Wainwright, and Michael I Jordan. Local maxima in the likelihood of G aussian mixture models: Structural results and algorithmic consequences. In Advances in neural information processing systems , pages 4116--4124, 2016
work page 2016
-
[41]
Matrix completion from a few entries
Raghunandan H Keshavan, Andrea Montanari, and Sewoong Oh. Matrix completion from a few entries. IEEE transactions on information theory , 56(6):2980--2998, 2010
work page 2010
-
[42]
Reduction of variance for G aussian densities via restriction to convex sets
Marek Kanter and Harold Proppe. Reduction of variance for G aussian densities via restriction to convex sets. Journal of Multivariate Analysis , 7(1):74--81, 1977
work page 1977
-
[43]
Global Convergence of EM Algorithm for Mixtures of Two Component Linear Regression
Jeongyeol Kwon, Wei Qian, Constantine Caramanis, Yudong Chen, and Damek Davis. Global convergence of EM algorithm for mixtures of two component linear regression. arXiv preprint arXiv:1810.05752v3 , 2018
work page internal anchor Pith review Pith/arXiv arXiv 2018
-
[44]
The concentration of measure phenomenon
Michel Ledoux. The concentration of measure phenomenon . Number 89. American Mathematical Soc., 2001
work page 2001
-
[45]
Consistency of multidimensional convex regression
Eunji Lim and Peter W Glynn. Consistency of multidimensional convex regression. Operations Research , 60(1):196--208, 2012
work page 2012
-
[46]
Sliced inverse regression for dimension reduction
Ker-Chau Li. Sliced inverse regression for dimension reduction. Journal of the American Statistical Association , 86(414):316--327, 1991
work page 1991
-
[47]
Siegel's formula via S tein's identities
Jun S Liu. Siegel's formula via S tein's identities. Statistics & Probability Letters , 21(3):247--251, 1994
work page 1994
-
[48]
Minimax rate of convergence and the performance of ERM in phase recovery
Guillaume Lecu \'e and Shahar Mendelson. Minimax rate of convergence and the performance of ERM in phase recovery. arXiv preprint arXiv:1311.5024 , 2013
work page internal anchor Pith review Pith/arXiv arXiv 2013
-
[49]
Convex piecewise-linear fitting
Alessandro Magnani and Stephen P Boyd. Convex piecewise-linear fitting. Optimization and Engineering , 10(1):1--17, 2009
work page 2009
-
[50]
A computational framework for multivariate convex regression and its variants
Rahul Mazumder, Arkopal Choudhury, Garud Iyengar, and Bodhisattva Sen. A computational framework for multivariate convex regression and its variants. Journal of the American Statistical Association , 114(525):318--331, 2019
work page 2019
-
[51]
Learning theory and algorithms for revenue optimization in second price auctions with reserve
Andres M Medina and Mehryar Mohri. Learning theory and algorithms for revenue optimization in second price auctions with reserve. In Proceedings of the 31st International Conference on Machine Learning (ICML-14) , pages 262--270, 2014
work page 2014
-
[52]
Jamie Morgenstern and Tim Roughgarden. Learning simple auctions. In Conference on Learning Theory , pages 1298--1318, 2016
work page 2016
-
[53]
Foundations of machine learning
Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar. Foundations of machine learning . MIT press, 2018
work page 2018
-
[54]
Moments calculation for the double truncated multivariate normal density, 2009
BG Manjunath and Stefan Wilhelm. Moments calculation for the double truncated multivariate normal density, 2009
work page 2009
-
[55]
Phase retrieval using alternating minimization
Praneeth Netrapalli, Prateek Jain, and Sujay Sanghavi. Phase retrieval using alternating minimization. In Advances in Neural Information Processing Systems , pages 2796--2804, 2013
work page 2013
-
[56]
Small ball probability estimates for log-concave measures
Grigoris Paouris. Small ball probability estimates for log-concave measures. Transactions of the American Mathematical Society , 364(1):287--308, 2012
work page 2012
-
[57]
Reconstructing convex sets from support line measurements
Jerry Ladd Prince and Alan S Willsky. Reconstructing convex sets from support line measurements. IEEE Transactions on Pattern Analysis and Machine Intelligence , 12(4):377--389, 1990
work page 1990
-
[58]
The L ittlewood-- O fford problem and invertibility of random matrices
Mark Rudelson and Roman Vershynin. The L ittlewood-- O fford problem and invertibility of random matrices. Advances in Mathematics , 218(2):600--633, 2008
work page 2008
-
[59]
Hanson-wright inequality and sub- G aussian concentration
Mark Rudelson and Roman Vershynin. Hanson-wright inequality and sub- G aussian concentration. Electronic Communications in Probability , 18, 2013
work page 2013
-
[60]
Small ball probabilities for linear images of high-dimensional distributions
Mark Rudelson and Roman Vershynin. Small ball probabilities for linear images of high-dimensional distributions. International Mathematics Research Notices , 2015(19):9594--9617, 2014
work page 2015
-
[61]
Fitting tractable convex sets to support function evaluations
Yong Sheng Soh and Venkat Chandrasekaran. Fitting tractable convex sets to support function evaluations. arXiv preprint arXiv:1903.04194 , 2019
-
[62]
A surprising covariance involving the minimum of multivariate normal variables
Andrew F Siegel. A surprising covariance involving the minimum of multivariate normal variables. Journal of the American Statistical Association , 88(421):77--80, 1993
work page 1993
-
[63]
Provable tensor methods for learning mixtures of generalized linear models
Hanie Sedghi, Majid Janzamin, and Anima Anandkumar. Provable tensor methods for learning mixtures of generalized linear models. Proceedings of Machine Learning Research , 51:1223--1231, 2014
work page 2014
-
[64]
Fitting Convex Sets to Data: Algorithms and Applications
Yong Sheng Soh. Fitting Convex Sets to Data: Algorithms and Applications . PhD thesis, California Institute of Technology, 2019
work page 2019
-
[65]
Nonparametric least squares estimation of a multivariate convex regression function
Emilio Seijo and Bodhisattva Sen. Nonparametric least squares estimation of a multivariate convex regression function. The Annals of Statistics , 39(3):1633--1657, 2011
work page 2011
-
[66]
Iterative least trimmed squares for mixed linear regression
Yanyao Shen and Sujay Sanghavi. Iterative least trimmed squares for mixed linear regression. arXiv preprint arXiv:1902.03653 , 2019
-
[67]
Phase Retrieval via Randomized Kaczmarz: Theoretical Guarantees
Yan Shuo Tan and Roman Vershynin . Phase retrieval via randomized K aczmarz: Theoretical guarantees. arXiv e-prints , page arXiv:1706.09993, Jun 2017
work page internal anchor Pith review Pith/arXiv arXiv 2017
-
[68]
The moment generating function of the truncated multi-normal distribution
Georges M Tallis. The moment generating function of the truncated multi-normal distribution. Journal of the Royal Statistical Society: Series B (Methodological) , 23(1):223--229, 1961
work page 1961
-
[69]
An analysis of the EM algorithm and entropy-like proximal point methods
Paul Tseng. An analysis of the EM algorithm and entropy-like proximal point methods. Mathematics of Operations Research , 29(1):27--44, 2004
work page 2004
- [70]
-
[71]
The uniform convergence of frequencies of the appearance of events to their probabilities
Vladimir N Vapnik and Aleksei Yakovlevich Chervonenkis. The uniform convergence of frequencies of the appearance of events to their probabilities. In Doklady Akademii Nauk , volume 181, pages 781--783. Russian Academy of Sciences, 1968
work page 1968
-
[72]
Regression analysis and empirical processes , volume 45 of CWI Tract
Sara A van de Geer. Regression analysis and empirical processes , volume 45 of CWI Tract . Stichting Mathematisch Centrum, Centrum voor Wiskunde en Informatica, Amsterdam, 1988
work page 1988
-
[73]
Aad W van der Vaart and Jon A. Wellner. Weak convergence and empirical processes . Springer Series in Statistics. Springer-Verlag, New York, 1996. With applications to statistics
work page 1996
-
[74]
Learning convex concepts from G aussian distributions with PCA
Santosh S Vempala. Learning convex concepts from G aussian distributions with PCA . In 2010 IEEE 51st Annual Symposium on Foundations of Computer Science , pages 124--130. IEEE, 2010
work page 2010
-
[75]
Introduction to the non-asymptotic analysis of random matrices
Roman Vershynin. Introduction to the non-asymptotic analysis of random matrices. arXiv preprint arXiv:1011.3027 , 2010
work page internal anchor Pith review Pith/arXiv arXiv 2010
-
[76]
High-dimensional probability: An introduction with applications in data science , volume 47
Roman Vershynin. High-dimensional probability: An introduction with applications in data science , volume 47. Cambridge University Press, 2018
work page 2018
-
[77]
High-dimensional statistics: A non-asymptotic viewpoint , volume 48
Martin J Wainwright. High-dimensional statistics: A non-asymptotic viewpoint , volume 48. Cambridge University Press, 2019
work page 2019
-
[78]
Phase retrieval with random G aussian sensing vectors by alternating projections
Ir \`e ne Waldspurger . Phase retrieval with random G aussian sensing vectors by alternating projections. IEEE Transactions on Information Theory , 64(5):3301--3312, May 2018
work page 2018
-
[79]
All of nonparametric statistics
Larry Wasserman. All of nonparametric statistics . Springer Science & Business Media, 2006
work page 2006
-
[80]
On the convergence properties of the EM algorithm
CF Jeff Wu. On the convergence properties of the EM algorithm. The Annals of statistics , 11(1):95--103, 1983
work page 1983
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.