pith. sign in

arxiv: 1907.08306 · v1 · pith:FUVNP7NAnew · submitted 2019-07-18 · 💻 cs.DS · stat.CO

A Polynomial Time Algorithm for Log-Concave Maximum Likelihood via Locally Exponential Families

Pith reviewed 2026-05-24 19:09 UTC · model grok-4.3

classification 💻 cs.DS stat.CO
keywords log-concave distributionsmaximum likelihood estimationpolynomial-time algorithmsconvex optimizationexponential familiesdensity estimationapproximation algorithmsmultivariate distributions
0
0 comments X

The pith

A polynomial-time algorithm approximates the maximum-likelihood log-concave distribution on n points in R^d to additive error ε.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper gives an algorithm that takes n points in R^d and ε>0 and returns, in time polynomial in n, d, and 1/ε, a log-concave distribution whose likelihood on the points is at most ε below the best possible log-concave likelihood, with high probability. The central step is to show that the optimal log-concave distribution, though not itself an exponential family, locally satisfies the same first-order conditions that exponential families obey. This local property converts the maximum-likelihood problem into a convex program whose subgradients can be approximated efficiently enough for a first-order method to succeed. A sympathetic reader would care because the result supplies the first provably efficient method for a core task in nonparametric statistics that had previously lacked polynomial-time solutions.

Core claim

The maximum likelihood log-concave distribution belongs to a class of structured distributions which, while not an exponential family, locally possesses key properties of exponential families. This connection allows the problem of computing the log-concave maximum likelihood distribution to be formulated as a convex optimization problem, and solved via an approximate first-order method. The resulting algorithm runs in poly(n,d,1/ε) time and, with high probability, returns a distribution whose likelihood on the input points is at most an additive ε less than the optimum achievable by any log-concave distribution.

What carries the argument

The local connection to exponential families, which turns the MLE problem into a convex program solvable by an approximate first-order method with carefully approximated subgradients.

If this is right

  • The MLE log-concave distribution can be approximated to additive ε in time polynomial in n, d, and 1/ε.
  • The approximation is obtained by solving a convex program that exploits the local exponential-family structure of the optimum.
  • Efficient approximation of the required subgradients is possible despite the technical delicacy of the estimation step.
  • The guarantee holds with high probability over the algorithm's internal randomness.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same local-structure argument may apply to other families that are not globally exponential but satisfy first-order conditions on suitable neighborhoods.
  • The convex-program reduction could be reused for related estimation tasks such as log-concave regression or constrained density estimation.
  • Running the algorithm on synthetic data with known MLE would give a direct empirical check of the local-property assumption.
  • The technique might extend to streaming or distributed settings if the subgradient approximations can be maintained incrementally.

Load-bearing premise

The maximum likelihood log-concave distribution belongs to a class of structured distributions which, while not an exponential family, locally possesses key properties of exponential families.

What would settle it

On a concrete low-dimensional instance where the exact log-concave MLE can be computed by an independent method, the algorithm returns a distribution whose likelihood is more than ε worse than the true optimum.

Figures

Figures reproduced from arXiv: 1907.08306 by Alistair Stewart, Anastasios Sidiropoulos, Brian Axelrod, Gregory Valiant, Ilias Diakonikolas.

Figure 1
Figure 1. Figure 1: An example of a tent function and its corresponding regular subdivision. Notice that the [PITH_FULL_IMAGE:figures/full_fig_p005_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Changing the height of the tent poles can change the induced regular subdivision (shown in [PITH_FULL_IMAGE:figures/full_fig_p006_2.png] view at source ↗
read the original abstract

We consider the problem of computing the maximum likelihood multivariate log-concave distribution for a set of points. Specifically, we present an algorithm which, given $n$ points in $\mathbb{R}^d$ and an accuracy parameter $\epsilon>0$, runs in time $poly(n,d,1/\epsilon),$ and returns a log-concave distribution which, with high probability, has the property that the likelihood of the $n$ points under the returned distribution is at most an additive $\epsilon$ less than the maximum likelihood that could be achieved via any log-concave distribution. This is the first computationally efficient (polynomial time) algorithm for this fundamental and practically important task. Our algorithm rests on a novel connection with exponential families: the maximum likelihood log-concave distribution belongs to a class of structured distributions which, while not an exponential family, "locally" possesses key properties of exponential families. This connection then allows the problem of computing the log-concave maximum likelihood distribution to be formulated as a convex optimization problem, and solved via an approximate first-order method. Efficiently approximating the (sub) gradients of the objective function of this optimization problem is quite delicate, and is the main technical challenge in this work.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The manuscript claims the first polynomial-time algorithm for approximating the maximum likelihood log-concave distribution: given n points in R^d and ε>0, it returns (w.h.p.) a log-concave distribution whose likelihood on the points is at most additive ε below the optimum, in time poly(n,d,1/ε). The approach rests on a novel reduction showing that the MLE belongs to a class of structured distributions that locally behave like exponential families, allowing formulation as a convex program solved by an approximate first-order method; the main technical hurdle is efficient approximation of the subgradients of the resulting objective.

Significance. If the central claim holds, the result is significant: it resolves a long-standing computational gap for a broad and practically relevant nonparametric class (log-concave distributions), supplies the first poly-time algorithm for their MLE, and introduces a reusable technique via locally exponential families that may apply to other structured distribution families. The explicit identification of subgradient approximation as the core difficulty is a useful contribution to the literature on computational statistics.

major comments (1)
  1. [Abstract] Abstract: the existence of a poly(n,d,1/ε)-time algorithm with additive ε guarantee is asserted, yet the text supplies no proof outline, no error analysis for the first-order method, and no concrete bounds or algorithm for the subgradient approximation that the abstract itself identifies as the main technical challenge; without these details the central claim cannot be verified.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for their thoughtful review and for acknowledging the potential significance of the result. We address the single major comment below.

read point-by-point responses
  1. Referee: [Abstract] Abstract: the existence of a poly(n,d,1/ε)-time algorithm with additive ε guarantee is asserted, yet the text supplies no proof outline, no error analysis for the first-order method, and no concrete bounds or algorithm for the subgradient approximation that the abstract itself identifies as the main technical challenge; without these details the central claim cannot be verified.

    Authors: We respectfully disagree. The abstract is a concise summary; the full manuscript supplies the requested elements. Section 2 defines locally exponential families, proves the MLE lies in this class, and reduces the problem to convex optimization. Section 3 states the convex program. Section 4 gives the first-order method together with its error analysis, convergence bounds, and overall approximation guarantee. Section 5 presents the subgradient approximation algorithm, including explicit runtime and accuracy bounds that establish the poly(n,d,1/ε) runtime. These sections collectively provide the proof outline, error analysis, and concrete subgradient procedure referenced in the abstract. revision: no

Circularity Check

0 steps flagged

No significant circularity; derivation self-contained via novel connection

full rationale

The paper derives a polynomial-time algorithm by establishing a novel connection between the MLE log-concave distribution and local exponential-family properties, which permits a convex optimization formulation solved by approximate first-order methods. This connection is introduced as original rather than derived from prior self-citations or fitted inputs; the subgradient approximation challenge is addressed directly from the stated local properties without reducing the target result to a definition or fit of itself. No self-definitional steps, load-bearing self-citations, or renamings of known results appear in the provided description, rendering the chain independent of its own outputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no concrete free parameters, axioms, or invented entities; ledger left empty pending full text.

pith-pipeline@v0.9.0 · 5762 in / 1061 out tokens · 25671 ms · 2026-05-24T19:09:25.346874+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

  • IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel echoes
    ?
    echoes

    ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.

    the family {qy} is locally-exponential if ... (1) A(y) is convex in y, and (2) Ex∼qy[T(x)] ∈ ∂y A(y). ... tent distributions are in fact locally exponential (Lemma 1)

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

63 extracted references · 63 canonical work pages · 8 internal anchors

  1. [1]

    Acharya, I

    J. Acharya, I. Diakonikolas, C. Hegde, J. Li, and L. Schmidt. Fast and near-optimal algorithms for approximating distributions by histograms. In Proceedings of the 34th ACM Symposium on Principles of Database Systems, PODS 2015, pages 249–263, 2015

  2. [2]

    Sample-Optimal Density Estimation in Nearly-Linear Time

    J. Acharya, I. Diakonikolas, J. Li, and L. Schmidt. Sample-optimal density estimation in nearly- linear time. In Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2017, pages 1278–1289, 2017. Available at https://arxiv.org/abs/1506.00671

  3. [3]

    M. Y . An. Log-concave probability distributions: Theory and statistical testing. Technical Report Economics Working Paper Archive at WUSTL, Washington University at St. Louis, 1995

  4. [4]

    An Efficient Algorithm for High-Dimensional Log-Concave Maximum Likelihood

    B. Axelrod and G. Valiant. An efficient algorithm for high-dimensional log-concave maximum likelihood. CoRR, abs/1811.03204, 2018. URL http://arxiv.org/abs/1811.03204

  5. [5]

    Bagnoli and T

    M. Bagnoli and T. Bergstrom. Log-concave probability and its applications. Economic The- ory, 26(2):pp. 445–469, 2005. ISSN 09382259. URL http://www.jstor.org/stable/ 25055959

  6. [6]

    Balabdaoui and C

    F. Balabdaoui and C. R. Doss. Inference for a two-component mixture of symmetric distributions under log-concavity. Bernoulli, 24(2):1053–1071, 05 2018. doi: 10.3150/16-BEJ864

  7. [7]

    Balabdaoui and J

    F. Balabdaoui and J. A. Wellner. Estimation of a k-monotone density: Limit distribution theory and the spline connection. The Annals of Statistics, 35(6):pp. 2536–2564, 2007. ISSN 00905364

  8. [8]

    Balabdaoui and J

    F. Balabdaoui and J. A. Wellner. Estimation of a k-monotone density: characterizations, consistency and minimax lower bounds. Statistica Neerlandica, 64(1):45–70, 2010

  9. [9]

    Balabdaoui, K

    F. Balabdaoui, K. Rufibach, and J. A. Wellner. Limit distribution theory for maximum likelihood estimation of a log-concave density. The Annals of Statistics, 37(3):pp. 1299–1331, 2009. ISSN 00905364

  10. [10]

    Barlow, D.J

    R.E. Barlow, D.J. Bartholomew, J.M. Bremner, and H.D. Brunk. Statistical Inference under Order Restrictions. Wiley, New York, 1972

  11. [11]

    L. Birgé. Estimating a density under order restrictions: Nonasymptotic minimax risk. Annals of Statistics, 15(3):995–1012, 1987. 13

  12. [12]

    L. Birgé. On the risk of histograms for estimating decreasing densities. Annals of Statistics, 15 (3):1013–1022, 1987

  13. [13]

    S. P. Boyd and L. Vandenberghe.Convex Optimization. Cambridge University Press, Cambridge, UK, 2004. URL http://www.stanford.edu/~boyd/cvxbook.html

  14. [14]

    H. D. Brunk. On the estimation of parameters restricted by inequalities. The Annals of Mathematical Statistics, 29(2):pp. 437–454, 1958. ISSN 00034851

  15. [15]

    C. L. Canonne, I. Diakonikolas, T. Gouleakis, and R. Rubinfeld. Testing shape restrictions of discrete distributions. In STACS, pages 25:1–25:14, 2016

  16. [16]

    Carpenter, I

    T. Carpenter, I. Diakonikolas, A. Sidiropoulos, and A. Stewart. Near-optimal sample complexity bounds for maximum likelihood estimation of multivariate log-concave densities. InConference On Learning Theory, COLT 2018, pages 1234–1262, 2018. URL http://proceedings.mlr. press/v75/carpenter18a.html

  17. [17]

    Chan and H

    K.S. Chan and H. Tong. Testing for multimodality with dependent data. Biometrika, 91(1): 113–123, 2004

  18. [18]

    S. Chan, I. Diakonikolas, R. Servedio, and X. Sun. Learning mixtures of structured distributions over discrete domains. In SODA, pages 1380–1394, 2013

  19. [19]

    S. Chan, I. Diakonikolas, R. Servedio, and X. Sun. Efficient density estimation via piecewise polynomial approximation. In STOC, pages 604–613, 2014

  20. [20]

    S. Chan, I. Diakonikolas, R. Servedio, and X. Sun. Near-optimal density estimation in near-linear time using variable-width histograms. In NIPS, pages 1844–1852, 2014

  21. [21]

    Chen and R

    Y . Chen and R. J. Samworth. Smoothed log-concave maximum likelihood estimation with applications. Statist. Sinica, 23:1373–1398, 2013

  22. [22]

    M. Cule, R. Samworth, and M. Stewart. Maximum likelihood estimation of a multi-dimensional log-concave density. Journal of the Royal Statistical Society: Series B, 72:545–607, 2010

  23. [23]

    Dagan and G

    Y . Dagan and G. Kur. The log-concave maximum likelihood estimator is optimal in high dimensions. CoRR, abs/1903.05315, 2019. URL http://arxiv.org/abs/1903.05315

  24. [24]

    Daskalakis, I

    C. Daskalakis, I. Diakonikolas, and R.A. Servedio. Learningk-modal distributions via testing. In SODA, pages 1371–1385, 2012

  25. [25]

    Daskalakis, I

    C. Daskalakis, I. Diakonikolas, and R.A. Servedio. Learning Poisson Binomial Distributions. In STOC, pages 709–728, 2012

  26. [26]

    Daskalakis, I

    C. Daskalakis, I. Diakonikolas, R. O’Donnell, R.A. Servedio, and L. Tan. Learning Sums of Independent Integer Random Variables. In FOCS, pages 217–226, 2013

  27. [27]

    Daskalakis, A

    C. Daskalakis, A. De, G. Kamath, and C. Tzamos. A size-free CLT for poisson multinomials and its applications. In Proceedings of the 48th Annual ACM Symposium on the Theory of Computing, STOC ’16, 2016

  28. [28]

    A. De, P. M. Long, and R. A. Servedio. Density estimation for shift-invariant multidimensional distributions. CoRR, abs/1811.03744, 2018. URL http://arxiv.org/abs/1811.03744

  29. [29]

    Optimal Learning via the Fourier Transform for Sums of Independent Integer Random Variables

    I. Diakonikolas, D. M. Kane, and A. Stewart. Optimal learning via the fourier transform for sums of independent integer random variables. In Proceedings of the 29th Confer- ence on Learning Theory, COLT 2016 , pages 831–849, 2016. Full version available at https://arxiv.org/abs/1505.00662

  30. [30]

    Properly Learning Poisson Binomial Distributions in Almost Polynomial Time

    I. Diakonikolas, D. M. Kane, and A. Stewart. Properly learning poisson binomial distributions in almost polynomial time. In Proceedings of the 29th Conference on Learning Theory, COLT 2016, pages 850–878, 2016. Full version available at https://arxiv.org/abs/1511.04066

  31. [31]

    Diakonikolas, D

    I. Diakonikolas, D. M. Kane, and A. Stewart. The fourier transform of poisson multinomial distributions and its algorithmic applications. In Proceedings of STOC’16, 2016. 14

  32. [32]

    Diakonikolas, D

    I. Diakonikolas, D. M. Kane, and A. Stewart. Efficient Robust Proper Learning of Log-concave Distributions. Arxiv report, 2016

  33. [33]

    Diakonikolas, D

    I. Diakonikolas, D. M. Kane, and A. Stewart. Learning multivariate log-concave distributions. In Proceedings of the 30th Conference on Learning Theory, COLT 2017, pages 711–727, 2017. URL http://proceedings.mlr.press/v65/diakonikolas17a.html

  34. [34]

    Diakonikolas, J

    I. Diakonikolas, J. Li, and L. Schmidt. Fast and sample near-optimal algorithms for learning multidimensional histograms. In Conference On Learning Theory, COLT 2018, pages 819–842, 2018

  35. [35]

    A Polynomial Time Algorithm for Maximum Likelihood Estimation of Multivariate Log-concave Densities

    I. Diakonikolas, A. Sidiropoulos, and A. Stewart. A polynomial time algorithm for maximum likelihood estimation of multivariate log-concave densities. CoRR, abs/1812.05524, 2018. URL http://arxiv.org/abs/1812.05524

  36. [36]

    C. R. Doss and J. A. Wellner. Global rates of convergence of the mles of log-concave and s-concave densities. Ann. Statist., 44(3):954–981, 06 2016

  37. [37]

    J. C. Duchi. Introductory lectures on stochastic convex optimization. Park City Mathematics Institute, Graduate Summer School Lectures, 2016

  38. [38]

    Dumbgen and K

    L. Dumbgen and K. Rufibach. Maximum likelihood estimation of a log-concave density and its distribution function: Basic properties and uniform consistency. Bernoulli, 15(1):40–68, 2009

  39. [39]

    logcondens: Computations related to univariate log- concave density estimation

    Lutz Dümbgen and Kaspar Rufibach. logcondens: Computations related to univariate log- concave density estimation. Journal of Statistical Software, 39(6):1–28, 2011. URL http: //www.jstatsoft.org/v39/i06/

  40. [40]

    Fougères

    A.-L. Fougères. Estimation de densités unimodales. Canadian Journal of Statistics, 25:375–387, 1997

  41. [41]

    Gao and J

    F. Gao and J. A. Wellner. On the rate of convergence of the maximum likelihood estimator of a k-monotone density. Science in China Series A: Mathematics, 52:1525–1538, 2009

  42. [42]

    Grenander

    U. Grenander. On the theory of mortality measurement. Skand. Aktuarietidskr., 39:125–153, 1956

  43. [43]

    Groeneboom

    P. Groeneboom. Estimating a monotone density. In Proc. of the Berkeley Conference in Honor of Jerzy Neyman and Jack Kiefer, pages 539–555, 1985

  44. [44]

    Groeneboom and G

    P. Groeneboom and G. Jongbloed. Nonparametric Estimation under Shape Constraints: Esti- mators, Algorithms and Asymptotics. Cambridge University Press, 2014

  45. [45]

    Han and J

    Q. Han and J. A. Wellner. Approximation and estimation of s-concave densities via renyi divergences. Ann. Statist., 44(3):1332–1359, 06 2016

  46. [46]

    D. L. Hanson and G. Pledger. Consistency in concave regression. The Annals of Statistics, 4(6): pp. 1038–1050, 1976. ISSN 00905364

  47. [47]

    H. K. Jankowski and J. A. Wellner. Estimation of a discrete monotone density. Electronic Journal of Statistics, 3:1567–1605, 2009

  48. [48]

    Kannan, L

    R. Kannan, L. Lovász, and M. Simonovits. Random walks and an o*(n5) volume algorithm for convex bodies. Random Structures & Algorithms, 11(1):1–50, 1997

  49. [49]

    A. K. H. Kim and R. J. Samworth. Global rates of convergence in log-concave density estimation. Ann. Statist., 44(6):2756–2779, 12 2016. Available at http://arxiv.org/abs/1404.2298

  50. [50]

    Koenker and I

    R. Koenker and I. Mizera. Quasi-concave density estimation. Ann. Statist., 38(5):2998–3027, 2010

  51. [51]

    Lovász and S

    L. Lovász and S. Vempala. Fast algorithms for logconcave functions: Sampling, rounding, inte- gration and optimization. In 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS’06), pages 57–68. IEEE, 2006. 15

  52. [52]

    Lovász and S

    L. Lovász and S. Vempala. Hit-and-run from a corner. SIAM Journal on Computing, 35(4): 985–1005, 2006

  53. [53]

    Lovász and S

    L. Lovász and S. Vempala. Simulated annealing in convex bodies and an o*(n4) volume algorithm. Journal of Computer and System Sciences, 72(2):392–417, 2006

  54. [54]

    Lovász and S

    L. Lovász and S. Vempala. The geometry of logconcave functions and sampling algorithms. Random Structures and Algorithms, 30(3):307–358, 2007

  55. [55]

    Prakasa Rao

    B.L.S. Prakasa Rao. Estimation of a unimodal density. Sankhya Ser. A, 31:23–36, 1969

  56. [56]

    Fast Multivariate Log-Concave Density Estimation

    F. Rathke and C. Schnörr. Fast multivariate log-concave density estimation. CoRR, abs/1805.07272, 2018. URL https://arxiv.org/abs/1805.07272

  57. [57]

    Robeva, B

    E. Robeva, B. Sturmfels, and C. Uhler. Geometry of Log-Concave Density Estimation. ArXiv e-prints, 2017. Available at https://arxiv.org/abs/1704.01910

  58. [58]

    R. J. Samworth. Recent progress in log-concave density estimation. ArXiv e-prints, 2017

  59. [59]

    Saumard and J

    A. Saumard and J. A. Wellner. Log-concavity and strong log-concavity: A review. Statist. Surv., 8:45–114, 2014

  60. [60]

    R. P. Stanley. Log-concave and unimodal sequences in algebra, combinatorics, and geometry. Annals of the New York Academy of Sciences, 576(1):500–535, 1989. ISSN 1749-6632. doi: 10.1111/j.1749-6632.1989.tb16434.x. URL http://dx.doi.org/10.1111/j.1749-6632. 1989.tb16434.x

  61. [61]

    M. J. Wainwright, M. I. Jordan, et al. Graphical models, exponential families, and variational inference. Foundations and TrendsR⃝ in Machine Learning, 1(1–2):1–305, 2008

  62. [62]

    G. Walther. Inference and modeling with log-concave distributions. Stat. Science, 24:319–327, 2009

  63. [63]

    E.J. Wegman. Maximum likelihood estimation of a unimodal density. I. and II. Ann. Math. Statist., 41:457–471, 2169–2174, 1970. 16 Appendix A Introduction To Exponential Families In this section, we give a brief overview of exponential families that covers just the material necessary to appreciate the connection between exponential families and the log-con...