Computing Lewis weights to high precision using local relative smoothness
Pith reviewed 2026-06-30 02:35 UTC · model grok-4.3
The pith
Algorithms compute ε-approximations to ℓ_p-Lewis weights for p ≥ 4 in O(p² log(m/ε)) leverage-score rounds.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By applying a local variant of relatively smooth gradient descent to the primal and dual ℓ_p-Lewis weight optimization problems and providing tools to convert between different notions of approximate ℓ_p-Lewis weights, one obtains ε-estimates using only O(p² log(m/ε)) rounds of leverage score computation for any p ≥ 4.
What carries the argument
Local variant of relatively smooth gradient descent applied to primal and dual ℓ_p-Lewis weight problems together with approximation-conversion tools.
If this is right
- Any downstream algorithm whose running time is dominated by the number of Lewis-weight rounds can be sped up by a factor of roughly p.
- Leverage-score oracles can be treated as black-box subroutines inside the new framework.
- The same round bound holds uniformly for all p ≥ 4 rather than requiring separate analyses per regime.
- High-precision Lewis weights become available at the same asymptotic cost as constant-factor approximations in the p-dependent term.
Where Pith is reading between the lines
- The technique may extend to 2 < p < 4 if suitable local smoothness parameters can be established for those regimes.
- Because the method separates the smoothness analysis from the oracle model, it could be combined with faster leverage-score algorithms developed for other settings.
- Practical implementations could replace exact leverage scores with approximate ones inside each descent step while still preserving the overall round bound.
- The conversion tools between primal and dual approximations might apply to other row-sampling problems that admit both primal and dual formulations.
Load-bearing premise
The local relative-smoothness descent method applies to both the primal and dual problems and the supplied conversion rules between approximate solutions preserve the claimed round complexity.
What would settle it
An explicit matrix and value of p ≥ 4 for which any sequence of local relatively-smooth steps that reach ε-accuracy requires Ω(p³) leverage-score rounds, or for which the conversion lemmas between approximate Lewis weights inflate the total round count beyond O(p² log(m/ε)).
read the original abstract
We provide algorithms that compute $\epsilon$-estimates of the $\ell_p$-Lewis weights of a matrix $A \in \mathbb{R}^{m \times n}$ for $p \geq 4$ using $O(p^2 \log(m/\epsilon))$ rounds of leverage score computation, where $\ell_p$-Lewis weights and leverage scores are both standard measures of row importance. This improves upon the state-of-the-art round complexity of $O(p^3 \log(m/\epsilon))$ due to Fazel, Lee, Padmanabha, and Sidford (2022). We obtain our results by carefully applying a local variant of relatively smooth gradient descent to primal and dual forms of the $\ell_p$-Lewis weight optimization problem and providing tools to convert between different notions of approximate $\ell_p$-Lewis weights.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims algorithms to compute ε-approximations to the ℓ_p-Lewis weights of an m×n matrix A for p≥4 in O(p² log(m/ε)) rounds of leverage-score computation. This improves the prior O(p³ log(m/ε)) bound of Fazel et al. (2022). The improvement is obtained by applying a local variant of relatively smooth gradient descent to the primal and dual ℓ_p-Lewis weight optimization problems together with conversion tools between additive/multiplicative and primal/dual notions of approximation.
Significance. If correct, the quadratic improvement in round complexity is a meaningful advance for a core primitive in randomized numerical linear algebra, with potential impact on sampling, preconditioning, and downstream optimization routines. The technical device of local relative smoothness is presented as the key enabler and is credited explicitly as the source of the p-factor saving; the conversion lemmas are also credited for preserving the improved bound without additional p-multiplicative overhead.
major comments (2)
- [§4.2, Theorem 4.3] §4.2, Theorem 4.3 and the surrounding analysis of local relative smoothness: the claimed O(p log(1/ε)) iteration bound for the local GD variant on the primal and dual problems is load-bearing for the overall O(p²) claim. The local smoothness parameter is stated to be O(p), but the proof must explicitly show that the local Lipschitz constant does not re-introduce an extra linear p factor from the global analysis; otherwise the total round count reverts to the prior cubic bound.
- [§5, Lemmas 5.1–5.4] §5, Lemmas 5.1–5.4 (conversion tools): the total overhead of the additive-to-multiplicative and primal-to-dual conversions is asserted to be only O(log(m/ε)) rounds. These lemmas are load-bearing; each must be checked to ensure that the error translation does not incur a hidden O(p) factor that would accumulate across the O(p log(1/ε)) iterations and cancel the quadratic improvement.
minor comments (2)
- [Introduction] The abstract states the complexity result but the introduction should include a one-paragraph comparison table or explicit statement of how the new local analysis differs from the global p³ analysis of Fazel et al.
- Notation for the local smoothness parameter (e.g., L_local) should be introduced once and used consistently in all statements of the iteration bound.
Simulated Author's Rebuttal
We thank the referee for their careful reading and constructive feedback. Below we respond point-by-point to the major comments, clarifying the relevant analyses in the manuscript.
read point-by-point responses
-
Referee: [§4.2, Theorem 4.3] §4.2, Theorem 4.3 and the surrounding analysis of local relative smoothness: the claimed O(p log(1/ε)) iteration bound for the local GD variant on the primal and dual problems is load-bearing for the overall O(p²) claim. The local smoothness parameter is stated to be O(p), but the proof must explicitly show that the local Lipschitz constant does not re-introduce an extra linear p factor from the global analysis; otherwise the total round count reverts to the prior cubic bound.
Authors: Theorem 4.3 and its proof derive the local relative smoothness parameter directly from the Hessian of the ℓ_p-Lewis objective restricted to a local ball whose radius is chosen proportionally to the current gradient norm. Within this ball the higher-order terms that produce extra p factors in the global smoothness analysis are controlled by the relative-smoothness Bregman divergence, so the local Lipschitz constant remains O(p) with no additional linear p multiplier. The standard relatively-smooth GD convergence lemma then yields the stated O(p log(1/ε)) iteration bound without reverting to the prior cubic dependence. revision: no
-
Referee: [§5, Lemmas 5.1–5.4] §5, Lemmas 5.1–5.4 (conversion tools): the total overhead of the additive-to-multiplicative and primal-to-dual conversions is asserted to be only O(log(m/ε)) rounds. These lemmas are load-bearing; each must be checked to ensure that the error translation does not incur a hidden O(p) factor that would accumulate across the O(p log(1/ε)) iterations and cancel the quadratic improvement.
Authors: Lemmas 5.1–5.4 establish conversion factors whose multiplicative constants are independent of p (they depend only on the fixed regime p ≥ 4 and on the leverage-score approximation quality). Consequently each conversion step contributes only an O(log(m/ε)) round overhead whose p-dependence is absorbed into the existing O(p) term per phase; no additional p factor is introduced that would accumulate across iterations and restore a cubic bound. revision: no
Circularity Check
No circularity: algorithmic analysis derives O(p²) bound from local smoothness and conversion lemmas
full rationale
The paper derives the improved O(p² log(m/ε)) round complexity via explicit analysis of a local variant of relatively smooth gradient descent applied to the primal/dual ℓ_p-Lewis weight problems (p≥4), combined with conversion tools between approximate notions. This is a direct mathematical argument on convergence rates and overhead, not a reduction to fitted parameters, self-definitions, or prior self-citations as load-bearing premises. The cited O(p³) baseline from Fazel et al. (2022) serves only as comparison, not as an unverified uniqueness theorem or ansatz. No equations or steps in the provided abstract or description reduce the claimed result to its inputs by construction; the derivation is self-contained against external algorithmic benchmarks.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
, title =
Wojtaszczyk, P. , title =. 1991 , address =
1991
-
[2]
Khachiyan , journal =
Leonid G. Khachiyan , journal =. Rounding of Polytopes in the Real Number Model of Computation , volume =
-
[3]
Studies and Essays Presented to R
Fritz, John , title =. Studies and Essays Presented to R. Courant on his 60th Birthday , publisher =
-
[4]
Coordinate Methods for Accelerating _ Regression and Faster Approximate Maximum Flow , year=
Sidford, Aaron and Tian, Kevin , booktitle=. Coordinate Methods for Accelerating _ Regression and Faster Approximate Maximum Flow , year=
-
[5]
Proceedings of the 37th International Conference on Machine Learning , articleno =
Malitsky, Yura and Mishchenko, Konstantin , title =. Proceedings of the 37th International Conference on Machine Learning , articleno =. 2020 , publisher =
2020
-
[6]
Mathematical Programming , volume =
Adaptive proximal algorithms for convex optimization under local Lipschitz continuity of the gradient , author =. Mathematical Programming , volume =. 2025 , doi =
2025
-
[7]
and Bolte, J\'
Bauschke, Heinz H. and Bolte, J\'. A Descent Lemma Beyond Lipschitz Gradient Continuity: First-Order Methods Revisited and Applications , journal =. 2017 , doi =
2017
-
[8]
Liu and Thatchaphol Saranurak and Aaron Sidford and Zhao Song and Di Wang , title =
Jan van den Brand and Yin Tat Lee and Yang P. Liu and Thatchaphol Saranurak and Aaron Sidford and Zhao Song and Di Wang , title =. Proceedings of the fifty-third annual ACM symposium on Theory of Computing , publisher =
-
[9]
Liu and Aaron Sidford , title =
Arun Jambulapati and Yang P. Liu and Aaron Sidford , title =. Proceedings of the fifty-fourth annual ACM symposium on Theory of Computing , publisher =
-
[10]
Bourgain and J
J. Bourgain and J. Lindenstrauss and V. Milman , title =. Acta Mathematica , publisher =. 1989 , doi =
1989
-
[11]
Proceedings of the 2022 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) , chapter =
Maryam Fazel and Yin Tat Lee and Swati Padmanabhan and Aaron Sidford , title =. Proceedings of the 2022 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) , chapter =. doi:10.1137/1.9781611977073.107 , URL =
-
[12]
, journal =
Lewis, D. , journal =
-
[13]
2016 , school =
Yin Tat Lee , title =. 2016 , school =
2016
-
[14]
and Peng, Richard , booktitle=
Cohen, Michael B. and Peng, Richard , booktitle=. _p row sampling by
-
[15]
The Journal of Machine Learning Research , volume=
Fast approximation of matrix coherence and statistical leverage , author=. The Journal of Machine Learning Research , volume=. 2012 , publisher=
2012
-
[16]
Journal of the ACM (JACM) , volume=
Sampling from large matrices: An approach through geometric functional analysis , author=. Journal of the ACM (JACM) , volume=. 2007 , publisher=
2007
-
[17]
Woodruff and Taisuke Yasuda , title =
David P. Woodruff and Taisuke Yasuda , title =. Proceedings of the 2023 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) , chapter =. doi:10.1137/1.9781611977554.ch175 , URL =. https://epubs.siam.org/doi/pdf/10.1137/1.9781611977554.ch175 , year =
-
[18]
SIAM Journal on Computing , volume =
Apers, Simon and Gribling, Sander , title =. SIAM Journal on Computing , volume =. 2026 , URL =
2026
-
[19]
Apers, Simon and Gribling, Sander and Sidford, Aaron , journal=
-
[20]
Proceedings of the fortieth annual ACM symposium on Theory of Computing , pages=
Graph sparsification by effective resistances , author=. Proceedings of the fortieth annual ACM symposium on Theory of Computing , pages=
-
[21]
Solving linear programs with
Lee, Yin Tat and Sidford, Aaron , journal=. Solving linear programs with
-
[22]
2016 , publisher=
Minimum-volume ellipsoids: Theory and algorithms , author=. 2016 , publisher=
2016
-
[23]
Talagrand, Michel , journal=
-
[24]
SIAM Journal on Optimization , volume=
Relatively smooth convex optimization by first-order methods, and applications , author=. SIAM Journal on Optimization , volume=. 2018 , publisher=
2018
-
[25]
On accelerated proximal gradient methods for convex-concave optimization , author=
-
[26]
Journal of the ACM (JACM) , volume=
Low-rank approximation and regression in input sparsity time , author=. Journal of the ACM (JACM) , volume=. 2017 , publisher=
2017
-
[27]
A general convergence result for mirror descent with Armijo line search , author=
-
[28]
SIAM Journal on Imaging Sciences , volume=
Provable phase retrieval with mirror descent , author=. SIAM Journal on Imaging Sciences , volume=. 2023 , publisher=
2023
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.