Stochastic Tverberg theorems and their applications in multi-class logistic regression, data separability, and centerpoints of data

Jes\'us A. De Loera; Thomas A. Hogan

arxiv: 1907.09698 · v1 · pith:GH7YTD2Pnew · submitted 2019-07-23 · 🧮 math.PR · math.OC· math.ST· stat.TH

Stochastic Tverberg theorems and their applications in multi-class logistic regression, data separability, and centerpoints of data

Jes\'us A. De Loera , Thomas A. Hogan This is my paper

Pith reviewed 2026-05-24 17:31 UTC · model grok-4.3

classification 🧮 math.PR math.OCmath.STstat.TH

keywords stochastic geometryTverberg theoremconvex hullslogistic regressionmaximum likelihood estimationdata separabilitycenterpoints

0 comments

The pith

New stochastic Tverberg theorems bound the probability that m random data classes have convex hulls sharing a common point, and apply the bounds to existence of maximum likelihood estimators in multinomial logistic regression.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes stochastic geometry results that lower-bound the chance that m independently sampled point sets in Euclidean space have convex hulls with nonempty common intersection. These results translate directly into lower bounds on the probability that maximum likelihood estimators exist for multi-class logistic regression. The same framework yields connections to condition-number analysis for gradient methods and to algorithms that locate centerpoints inside data clouds. A reader would care because the theorems supply explicit probabilistic guarantees on data configurations that permit reliable model fitting and optimization.

Core claim

Stochastic Tverberg theorems assert that when points are drawn independently from continuous distributions in general position, the probability that the convex hulls of m labeled classes intersect in a common point admits a positive lower bound depending only on the number of points per class, the number of classes, and the ambient dimension; the same probability bound lower-bounds the chance that multinomial logistic regression admits a finite maximum-likelihood estimator.

What carries the argument

The stochastic Tverberg theorem, which converts the classical combinatorial statement about partitions and convex hulls into a probability statement under independent continuous sampling.

If this is right

Explicit lower bounds exist on the probability that maximum-likelihood estimators exist for multinomial logistic regression with m classes.
The same intersection probabilities control the condition numbers that govern convergence of steepest-descent algorithms for logistic regression.
Centerpoints of finite point clouds can be located by searching for points that realize the guaranteed intersection property with positive probability.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The bounds could guide the minimum number of labeled examples needed per class to make non-existence of the estimator unlikely.
The same sampling model might be reused to obtain probabilistic guarantees for other convex-separability problems in statistical learning.

Load-bearing premise

The data points are drawn independently from continuous distributions and lie in general position in Euclidean space.

What would settle it

Generate thousands of independent samples of m labeled point sets from a continuous distribution such as the standard Gaussian in fixed dimension, compute the empirical frequency of common convex-hull intersection, and check whether this frequency lies below the explicit lower bound given by the theorem for those parameters.

Figures

Figures reproduced from arXiv: 1907.09698 by Jes\'us A. De Loera, Thomas A. Hogan.

**Figure 1.1.** Figure 1.1: A partition in three data classes with tolerance one. All three convex hulls intersect even after any one point is removed. The parameter of tolerance is also significant in studying MLE existence. A natural observation is that t tolerant partitions correspond to robust MLE existence. Any t points, possibly corrupted or outlier data, can be removed and still the convex hulls of the data with each label i… view at source ↗

read the original abstract

We present new stochastic geometry theorems that give bounds on the probability that $m$ random data classes all contain a point in common in their convex hulls. We apply these stochastic separation theorems to obtain bounds on the probability of existence of maximum likelihood estimators in multinomial logistic regression. We also discuss connections to condition numbers for analysis of steepest descent algorithms in logistic regression and to the computation of centerpoints of data clouds.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper gives stochastic Tverberg theorems bounding the probability that m random point sets have intersecting convex hulls, then maps this to bounds on MLE existence for multinomial logistic regression.

read the letter

The core new piece is the stochastic Tverberg results that quantify the chance m independent random classes in general position have a point common to all their convex hulls. The authors carry this over to give probability statements on when multinomial logistic regression admits a finite maximum likelihood estimator, which fails precisely when the classes are separable in the relevant sense. They also note links to centerpoints and condition numbers for gradient methods in logistic regression. The transfer looks direct because the geometric assumptions match the usual i.i.d. continuous-distribution setup in the regression literature. No extra parameters or self-referential fitting appear in the claims. The geometric statements stand on their own and could be of interest even without the regression application. The main limitation is that the abstract gives no proof details, no explicit constants, and no comparison to earlier bounds on MLE existence or data separability. Without those it is hard to judge how much tighter the new estimates are or whether they change any practical conclusions. The centerpoint and condition-number remarks also stay at the level of discussion rather than developed results. This is for readers already working at the geometry-statistics boundary, especially those who use convex-hull arguments in classification or robust estimation. A specialist in probabilistic combinatorics might extract the theorems for separate use. I would send the paper to peer review; the geometric claims are self-contained enough to merit referee time even if the regression application needs more numerical checks.

Referee Report

0 major / 2 minor

Summary. The manuscript presents new stochastic Tverberg theorems that bound the probability that m random data classes, consisting of points drawn independently from continuous distributions in general position in Euclidean space, have a nonempty common intersection of their convex hulls. These geometric results are applied to derive bounds on the probability that the maximum likelihood estimator exists in multinomial logistic regression, with additional discussion of connections to condition numbers for steepest descent algorithms and to centerpoint computations for data clouds.

Significance. If the central theorems hold, the work supplies explicit, parameter-free probabilistic guarantees for data separability that transfer directly to statistical estimation problems in multi-class logistic regression. This constitutes a concrete bridge between stochastic combinatorial geometry and machine-learning theory, with potential utility for analyzing when MLEs are well-defined and for related algorithmic analyses. The absence of free parameters or ad-hoc entities in the derivations is a positive feature.

minor comments (2)

[Abstract] The abstract and introduction would benefit from an explicit statement of the dimension dependence (or lack thereof) in the probability bounds derived from the stochastic Tverberg theorems.
[Introduction] A short remark clarifying how the general-position assumption is maintained under the continuous-distribution hypothesis would improve readability for readers outside combinatorial geometry.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, significance assessment, and recommendation of minor revision. No specific major comments were raised in the report.

Circularity Check

0 steps flagged

No significant circularity; derivation self-contained

full rationale

The paper introduces new stochastic Tverberg-type theorems on the probability that m random point sets in general position have intersecting convex hulls, derived from standard combinatorial geometry and probabilistic arguments on continuous distributions. These theorems are then applied externally to bound the probability of MLE existence in multinomial logistic regression via data separability. No load-bearing step reduces by construction to a fitted parameter, self-definition, or self-citation chain; the geometric results stand independently and transfer to the regression setting under the stated assumptions without tautology. This is the normal case of a self-contained mathematical derivation.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The work rests on standard assumptions from convex and stochastic geometry; no free parameters, invented entities, or ad-hoc axioms are visible in the abstract.

axioms (1)

domain assumption Data points drawn independently from continuous distributions in general position
Invoked to ensure Tverberg-type intersection results apply to the random classes.

pith-pipeline@v0.9.0 · 5605 in / 993 out tokens · 28355 ms · 2026-05-24T17:31:12.386414+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Theorem 2.5 (Tverberg Threshold Phenomena for equi-partitions). ... Em,f(m),D is Tverberg with high probability if f(m) ≫ log₂(m)
IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean absolute_floor_iff_bare_distinguishability unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

balanced about some point p ... every hyperplane through p partitions D into two sets of equal measure

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

24 extracted references · 24 canonical work pages · 2 internal anchors

[1]

Albert and J

A. Albert and J. A. Anderson , On the existence of maximum likelihood estimates in logistic regression models, Biometrika, 71 (1984), pp. 1–10, https://doi.org/10.1093/biomet/71.1.1, http://dx.doi.org/10.1093/ biomet/71.1.1, https://arxiv.org/abs//oup/backﬁle/content public/journal/biomet/71/1/10.1093/biomet/ 71.1.1/2/71-1-1.pdf

work page doi:10.1093/biomet/71.1.1 1984
[2]

B´ar´any and P

I. B´ar´any and P. Sober´on, Tverberg’s theorem is 50 years old: A survey , Bull. Amer. Math. Soc., 55 (2018), pp. 459–492, https://doi.org/https://doi.org/10.1090/bull/1634. 11

work page doi:10.1090/bull/1634 2018
[3]

E. J. Cand `es and P. Sur , The phase transition for the existence of the maximum likelihood estimate in high-dimensional logistic regression, 2019. to appear Annals of Statistics

work page 2019
[4]

T. M. Chan , An optimal randomized algorithm for maximum Tukey depth , in Proceedings of the Fifteenth Annual ACM-SIAM Symposium on Discrete Algorithms, ACM, New York, 2004, pp. 430–436

work page 2004
[5]

K. L. Clarkson, D. Eppstein, G. Miller, C. Sturtivant, and S. Teng , Approximating center points with iterative Radon points , International Journal of Computational Geometry & Applications, 6 (1996), pp. 357–377

work page 1996
[6]

T. M. Cover, Geometrical and statistical properties of systems of linear inequalities with applications in pattern recognition, Electronic Computers, IEEE Transactions on, EC-14 (1965), pp. 326 – 334, https://doi.org/10. 1109/PGEC.1965.264137

work page arXiv 1965
[7]

J. A. De Loera, X. Goaoc, F. Meunier, and N. H. Mustafa , The discrete yet ubiquitous theorems of Carath´ eodory, Helly, Sperner, Tucker, and Tverberg, Bull. Amer. Math. Soc. (N.S.), 56 (2019), pp. 415– 511, https://doi.org/10.1090/bull/1653, https://doi.org/10.1090/bull/1653

work page doi:10.1090/bull/1653 2019
[8]

Erd˝os and A

P. Erd˝os and A. R ´enyi, On a classical problem of probability theory , Magyar Tud. Akad. Mat. Kutat´ o Int. K¨ ozl., 6 (1961), pp. 215–220

work page 1961
[9]

R. M. Freund, P. Grigas, and R. Mazumder , Condition Number Analysis of Logistic Regression, and its Implications for Standard First-Order Solution Methods , arXiv e-prints, (2018), arXiv:1810.08727, p. arXiv:1810.08727, https://arxiv.org/abs/1810.08727

work page internal anchor Pith review Pith/arXiv arXiv 2018
[10]

James, D

G. James, D. Witten, T. Hastie, and R. Tibshirani , An introduction to statistical learning , vol. 103 of Springer Texts in Statistics, Springer, New York, 2013, https://doi.org/10.1007/978-1-4614-7138-7, https: //doi.org/10.1007/978-1-4614-7138-7. With applications in R

work page doi:10.1007/978-1-4614-7138-7 2013
[11]

D. G. Larman , On sets projectively equivalent to the vertices of a convex polytope , Bulletin of the London Mathematical Society, 4 (1972), pp. 6–12

work page 1972
[12]

McCullagh and J

P. McCullagh and J. A. Nelder , Generalized linear models, Monographs on Statistics and Applied Prob- ability, Chapman & Hall, London, 1989, https://doi.org/10.1007/978-1-4899-3242-6, https://doi.org/10. 1007/978-1-4899-3242-6. Second edition [of MR0727836]

work page doi:10.1007/978-1-4899-3242-6 1989
[13]

G. L. Miller and D. R. Sheehy , Approximate centerpoints with proofs, Comput. Geom., 43 (2010), pp. 647– 654, https://doi.org/10.1016/j.comgeo.2010.04.006, https://doi.org/10.1016/j.comgeo.2010.04.006

work page doi:10.1016/j.comgeo.2010.04.006 2010
[14]

Mulzer and D

W. Mulzer and D. Werner , Approximating Tverberg points in linear time for any ﬁxed dimension , Discrete & Computational Geometry, 50 (2013), pp. 520–535

work page 2013
[15]

Algorithms for Tverberg's theorem via centerpoint theorems

D. Rolnick and P. Sober ´on, Algorithmic aspects of Tverberg’s theorem , CoRR, abs/1601.03083 (2016), http://arxiv.org/abs/1601.03083, https://arxiv.org/abs/1601.03083

work page internal anchor Pith review Pith/arXiv arXiv 2016
[16]

P. J. Rousseeuw and A. Struyf , Computing location depth and regression depth in higher dimensions , Statistics and Computing, 8 (1998), pp. 193–203, https://doi.org/10.1023/A:1008945009397, https://doi. org/10.1023/A:1008945009397

work page doi:10.1023/a:1008945009397 1998
[17]

Schl¨afli, Gesammelte mathematische Abhandlungen

L. Schl¨afli, Gesammelte mathematische Abhandlungen. Band I , Verlag Birkh¨ auser, Basel, 1950

work page 1950
[18]

M. J. Silvapulle, On the existence of maximum likelihood estimators for the binomial response models , Journal of the Royal Statistical Society. Series B (Methodological), 43 (1981), pp. 310–313, http://www.jstor.org/ stable/2984941

work page arXiv 1981
[19]

Sober´on, Robust Tverberg and colourful Carath´ eodory results via random choice, Combinatorics, Probability and Computing, 27 (2018), p

P. Sober´on, Robust Tverberg and colourful Carath´ eodory results via random choice, Combinatorics, Probability and Computing, 27 (2018), p. 427–440, https://doi.org/10.1017/S0963548317000591

work page doi:10.1017/s0963548317000591 2018
[20]

Sober ´on and R

P. Sober ´on and R. Strausz , A generalisation of Tverberg’s theorem , Discrete Comput. Geom., 47 (2012), pp. 455–460, https://doi.org/10.1007/s00454-011-9379-z, https://doi.org/10.1007/s00454-011-9379-z

work page doi:10.1007/s00454-011-9379-z 2012
[21]

J. W. Tukey, Mathematics and the picturing of data , Proceedings of the International Congress of Mathemati- cians, Vancouver, 1975, 2 (1975), pp. 523–531, https://ci.nii.ac.jp/naid/10029477185/en/

work page arXiv 1975
[22]

Tverberg, A generalization of Radon’s theorem , J

H. Tverberg, A generalization of Radon’s theorem , J. London Math. Soc., 41 (1966), pp. 123–128

work page 1966
[23]

Wagner and E

U. Wagner and E. Welzl, A continuous analogue of the upper bound theorem , Discrete Comput. Geom., 26 (2001), pp. 205–219, https://doi.org/10.1145/336154.336176, https://doi.org/10.1145/336154.336176. ACM Symposium on Computational Geometry (Hong Kong, 2000)

work page doi:10.1145/336154.336176 2001
[24]

J. G. Wendel , A problem in geometric probability. , Mathematica Scandinavia, 11 (1962), pp. 109–112, https: //doi.org/10.7146/math.scand.a-10655, https://www.mscand.dk/article/view/10655. 12

work page doi:10.7146/math.scand.a-10655 1962

[1] [1]

Albert and J

A. Albert and J. A. Anderson , On the existence of maximum likelihood estimates in logistic regression models, Biometrika, 71 (1984), pp. 1–10, https://doi.org/10.1093/biomet/71.1.1, http://dx.doi.org/10.1093/ biomet/71.1.1, https://arxiv.org/abs//oup/backﬁle/content public/journal/biomet/71/1/10.1093/biomet/ 71.1.1/2/71-1-1.pdf

work page doi:10.1093/biomet/71.1.1 1984

[2] [2]

B´ar´any and P

I. B´ar´any and P. Sober´on, Tverberg’s theorem is 50 years old: A survey , Bull. Amer. Math. Soc., 55 (2018), pp. 459–492, https://doi.org/https://doi.org/10.1090/bull/1634. 11

work page doi:10.1090/bull/1634 2018

[3] [3]

E. J. Cand `es and P. Sur , The phase transition for the existence of the maximum likelihood estimate in high-dimensional logistic regression, 2019. to appear Annals of Statistics

work page 2019

[4] [4]

T. M. Chan , An optimal randomized algorithm for maximum Tukey depth , in Proceedings of the Fifteenth Annual ACM-SIAM Symposium on Discrete Algorithms, ACM, New York, 2004, pp. 430–436

work page 2004

[5] [5]

K. L. Clarkson, D. Eppstein, G. Miller, C. Sturtivant, and S. Teng , Approximating center points with iterative Radon points , International Journal of Computational Geometry & Applications, 6 (1996), pp. 357–377

work page 1996

[6] [6]

T. M. Cover, Geometrical and statistical properties of systems of linear inequalities with applications in pattern recognition, Electronic Computers, IEEE Transactions on, EC-14 (1965), pp. 326 – 334, https://doi.org/10. 1109/PGEC.1965.264137

work page arXiv 1965

[7] [7]

J. A. De Loera, X. Goaoc, F. Meunier, and N. H. Mustafa , The discrete yet ubiquitous theorems of Carath´ eodory, Helly, Sperner, Tucker, and Tverberg, Bull. Amer. Math. Soc. (N.S.), 56 (2019), pp. 415– 511, https://doi.org/10.1090/bull/1653, https://doi.org/10.1090/bull/1653

work page doi:10.1090/bull/1653 2019

[8] [8]

Erd˝os and A

P. Erd˝os and A. R ´enyi, On a classical problem of probability theory , Magyar Tud. Akad. Mat. Kutat´ o Int. K¨ ozl., 6 (1961), pp. 215–220

work page 1961

[9] [9]

R. M. Freund, P. Grigas, and R. Mazumder , Condition Number Analysis of Logistic Regression, and its Implications for Standard First-Order Solution Methods , arXiv e-prints, (2018), arXiv:1810.08727, p. arXiv:1810.08727, https://arxiv.org/abs/1810.08727

work page internal anchor Pith review Pith/arXiv arXiv 2018

[10] [10]

James, D

G. James, D. Witten, T. Hastie, and R. Tibshirani , An introduction to statistical learning , vol. 103 of Springer Texts in Statistics, Springer, New York, 2013, https://doi.org/10.1007/978-1-4614-7138-7, https: //doi.org/10.1007/978-1-4614-7138-7. With applications in R

work page doi:10.1007/978-1-4614-7138-7 2013

[11] [11]

D. G. Larman , On sets projectively equivalent to the vertices of a convex polytope , Bulletin of the London Mathematical Society, 4 (1972), pp. 6–12

work page 1972

[12] [12]

McCullagh and J

P. McCullagh and J. A. Nelder , Generalized linear models, Monographs on Statistics and Applied Prob- ability, Chapman & Hall, London, 1989, https://doi.org/10.1007/978-1-4899-3242-6, https://doi.org/10. 1007/978-1-4899-3242-6. Second edition [of MR0727836]

work page doi:10.1007/978-1-4899-3242-6 1989

[13] [13]

G. L. Miller and D. R. Sheehy , Approximate centerpoints with proofs, Comput. Geom., 43 (2010), pp. 647– 654, https://doi.org/10.1016/j.comgeo.2010.04.006, https://doi.org/10.1016/j.comgeo.2010.04.006

work page doi:10.1016/j.comgeo.2010.04.006 2010

[14] [14]

Mulzer and D

W. Mulzer and D. Werner , Approximating Tverberg points in linear time for any ﬁxed dimension , Discrete & Computational Geometry, 50 (2013), pp. 520–535

work page 2013

[15] [15]

Algorithms for Tverberg's theorem via centerpoint theorems

D. Rolnick and P. Sober ´on, Algorithmic aspects of Tverberg’s theorem , CoRR, abs/1601.03083 (2016), http://arxiv.org/abs/1601.03083, https://arxiv.org/abs/1601.03083

work page internal anchor Pith review Pith/arXiv arXiv 2016

[16] [16]

P. J. Rousseeuw and A. Struyf , Computing location depth and regression depth in higher dimensions , Statistics and Computing, 8 (1998), pp. 193–203, https://doi.org/10.1023/A:1008945009397, https://doi. org/10.1023/A:1008945009397

work page doi:10.1023/a:1008945009397 1998

[17] [17]

Schl¨afli, Gesammelte mathematische Abhandlungen

L. Schl¨afli, Gesammelte mathematische Abhandlungen. Band I , Verlag Birkh¨ auser, Basel, 1950

work page 1950

[18] [18]

M. J. Silvapulle, On the existence of maximum likelihood estimators for the binomial response models , Journal of the Royal Statistical Society. Series B (Methodological), 43 (1981), pp. 310–313, http://www.jstor.org/ stable/2984941

work page arXiv 1981

[19] [19]

Sober´on, Robust Tverberg and colourful Carath´ eodory results via random choice, Combinatorics, Probability and Computing, 27 (2018), p

P. Sober´on, Robust Tverberg and colourful Carath´ eodory results via random choice, Combinatorics, Probability and Computing, 27 (2018), p. 427–440, https://doi.org/10.1017/S0963548317000591

work page doi:10.1017/s0963548317000591 2018

[20] [20]

Sober ´on and R

P. Sober ´on and R. Strausz , A generalisation of Tverberg’s theorem , Discrete Comput. Geom., 47 (2012), pp. 455–460, https://doi.org/10.1007/s00454-011-9379-z, https://doi.org/10.1007/s00454-011-9379-z

work page doi:10.1007/s00454-011-9379-z 2012

[21] [21]

J. W. Tukey, Mathematics and the picturing of data , Proceedings of the International Congress of Mathemati- cians, Vancouver, 1975, 2 (1975), pp. 523–531, https://ci.nii.ac.jp/naid/10029477185/en/

work page arXiv 1975

[22] [22]

Tverberg, A generalization of Radon’s theorem , J

H. Tverberg, A generalization of Radon’s theorem , J. London Math. Soc., 41 (1966), pp. 123–128

work page 1966

[23] [23]

Wagner and E

U. Wagner and E. Welzl, A continuous analogue of the upper bound theorem , Discrete Comput. Geom., 26 (2001), pp. 205–219, https://doi.org/10.1145/336154.336176, https://doi.org/10.1145/336154.336176. ACM Symposium on Computational Geometry (Hong Kong, 2000)

work page doi:10.1145/336154.336176 2001

[24] [24]

J. G. Wendel , A problem in geometric probability. , Mathematica Scandinavia, 11 (1962), pp. 109–112, https: //doi.org/10.7146/math.scand.a-10655, https://www.mscand.dk/article/view/10655. 12

work page doi:10.7146/math.scand.a-10655 1962