Randomized batch-sampling Kaczmarz methods for solving linear systems
Pith reviewed 2026-05-17 23:03 UTC · model grok-4.3
The pith
A unified randomized batch-sampling Kaczmarz framework yields scale-invariant expected linear convergence bounds for arbitrary static samplings.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We adopt a unified randomized batch-sampling Kaczmarz framework with per-iteration costs as low as cyclic block methods, and develop a general analysis technique to establish its convergence guarantee. With concentration inequalities, we derive new expected linear convergence rate bounds. The analysis applies to any randomized non-extended block Kaczmarz methods with arbitrary static stochastic samplings. In addition, the new rate bounds are scale-invariant, which eliminate the dependence on the magnitude of the data matrix.
What carries the argument
Unified randomized batch-sampling Kaczmarz framework together with concentration inequalities applied to arbitrary static stochastic samplings.
If this is right
- The new bounds are significantly tighter than existing ones in most experiments and better reflect the empirical convergence behavior of block methods.
- The batch-sampling distribution can be treated as a learnable parameter to achieve efficient performance in specific application scenarios.
- Convergence guarantees hold for arbitrary static stochastic samplings without further restrictions on the matrix.
- Scale invariance removes any dependence on the magnitude of the data matrix.
Where Pith is reading between the lines
- The framework could support development of adaptive or data-driven sampling distributions beyond the static case considered here.
- Scale-invariant bounds may simplify use on problems where data normalization varies across instances or iterations.
- Similar concentration-based arguments might transfer to other classes of randomized iterative solvers for linear systems.
- The low per-iteration cost property suggests direct applicability to very large-scale systems where block size must remain modest.
Load-bearing premise
Concentration inequalities can be applied directly to produce tight, scale-invariant bounds that hold for arbitrary static stochastic samplings without additional restrictions on the matrix or sampling distribution.
What would settle it
Finding a matrix and static sampling distribution for which the derived rate bound is violated or loses scale invariance would falsify the claim.
read the original abstract
To conduct a more in-depth investigation of randomized solvers for solving linear systems, we adopt a unified randomized batch-sampling Kaczmarz framework with per-iteration costs as low as cyclic block methods, and develop a general analysis technique to establish its convergence guarantee. With concentration inequalities, we derive new expected linear convergence rate bounds. The analysis applies to any randomized non-extended block Kaczmarz methods with arbitrary static stochastic samplings. In addition, the new rate bounds are scale-invariant, which eliminate the dependence on the magnitude of the data matrix. In most experiments, the new bounds are significantly tighter than existing ones and better reflect the empirical convergence behavior of block methods. Within this new framework, the batch-sampling distribution, as a learnable parameter, provides the possibility for block methods to achieve efficient performance in specific application scenarios, which deserves further investigation.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces a unified randomized batch-sampling Kaczmarz framework for solving linear systems with per-iteration costs comparable to cyclic block methods. It develops a general analysis technique that applies concentration inequalities to derive new expected linear convergence rate bounds. These bounds are claimed to hold for any randomized non-extended block Kaczmarz method under arbitrary static stochastic samplings and to be scale-invariant (independent of the magnitude of the data matrix A). Experiments indicate the new bounds are tighter than prior results and better match observed convergence behavior. The framework also treats the batch-sampling distribution as a learnable parameter for application-specific optimization.
Significance. If the analysis is correct, the work would supply a flexible, general-purpose tool for obtaining convergence guarantees across a wide family of block Kaczmarz variants while removing explicit dependence on matrix scaling. The ability to treat sampling distributions as tunable parameters could also enable practical performance gains in targeted applications. These features would be of interest to the numerical linear algebra community working on randomized iterative solvers.
major comments (2)
- [Analysis section (derivation of the convergence bound)] The central claim that concentration inequalities directly produce tight, scale-invariant expected linear rates for arbitrary static samplings (without extra restrictions on A or the sampling law) is load-bearing. The derivation must be shown to eliminate all dependence on row norms or matrix scaling factors; otherwise the scale-invariance and generality assertions do not hold. This issue is not resolved by the high-level description alone.
- [Analysis section (application of concentration inequalities)] The manuscript provides no explicit verification that the random variables to which concentration inequalities are applied satisfy the required boundedness or sub-Gaussian conditions independently of scaling. If these conditions implicitly reintroduce dependence on ||A|| or row norms, the claimed generality fails.
minor comments (2)
- [Introduction / Framework] Clarify the precise definition of 'non-extended block Kaczmarz' and 'static stochastic sampling' early in the paper to avoid ambiguity for readers unfamiliar with the subfield.
- [Numerical experiments] The experimental section would benefit from explicit reporting of matrix scaling factors used in the test problems to directly illustrate the scale-invariance property.
Simulated Author's Rebuttal
We thank the referee for the careful and constructive review of our manuscript. The comments on the analysis section are well-taken, and we have revised the paper to provide a more explicit and detailed derivation along with the requested verifications. Below we respond point by point to the major comments.
read point-by-point responses
-
Referee: [Analysis section (derivation of the convergence bound)] The central claim that concentration inequalities directly produce tight, scale-invariant expected linear rates for arbitrary static samplings (without extra restrictions on A or the sampling law) is load-bearing. The derivation must be shown to eliminate all dependence on row norms or matrix scaling factors; otherwise the scale-invariance and generality assertions do not hold. This issue is not resolved by the high-level description alone.
Authors: We agree that a high-level description is insufficient and that the derivation must be shown explicitly. In the revised manuscript we have expanded the analysis section with a complete step-by-step derivation. The key step is to apply the concentration inequalities to suitably normalized random variables whose moments are expressed solely in terms of the sampling probabilities and the normalized rows of A; all absolute scaling factors cancel in the resulting expectation. We have added a new lemma that isolates this cancellation and an appendix containing the full expanded proof. These changes directly address the load-bearing claim. revision: yes
-
Referee: [Analysis section (application of concentration inequalities)] The manuscript provides no explicit verification that the random variables to which concentration inequalities are applied satisfy the required boundedness or sub-Gaussian conditions independently of scaling. If these conditions implicitly reintroduce dependence on ||A|| or row norms, the claimed generality fails.
Authors: We acknowledge that the original manuscript lacked an explicit verification of the boundedness/sub-Gaussian conditions. In the revision we have inserted a dedicated paragraph immediately following the statement of the concentration inequality. There we verify that the relevant random variables are bounded by quantities that depend only on the sampling distribution and the normalized matrix (i.e., rows scaled by their Euclidean norms). Because the normalization is performed inside the random variable itself, the bound remains invariant under global scaling of A. This verification is now stated as a separate proposition to make the independence from ||A|| transparent. revision: yes
Circularity Check
No circularity: derivation applies standard concentration inequalities to a general framework
full rationale
The paper introduces a unified randomized batch-sampling Kaczmarz framework and develops a general analysis technique that invokes concentration inequalities to obtain expected linear convergence rates. These rates are stated to hold for arbitrary static stochastic samplings and to be scale-invariant by construction of the bounds rather than by fitting or self-definition. No load-bearing self-citations, uniqueness theorems from prior author work, or renamings of known results are indicated in the abstract or description; the central claims rest on applying external probabilistic tools to the iterates without reducing to the inputs by definition. The analysis is presented as independent of specific matrix restrictions beyond the framework itself.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Concentration inequalities apply to the random sampling process in the Kaczmarz iteration
Reference graph
Works this paper leans on
-
[1]
P. C. Hansen, J. Jørgensen, W. R. B. Lionheart,Computed Tomography: Algorithms, Insight, and Just Enough Theory, Society for Industrial and Applied Mathematics, Philadelphia, 2021
work page 2021
-
[2]
M. A. Olshanskii, E. E. Tyrtyshnikov,Iterative Methods for Linear Systems: Theory and Applications, Society for Industrial and Applied Mathematics, Philadelphia, 2014
work page 2014
-
[3]
C. Byrne, A unified treatment of some iterative algorithms in signal processing and image reconstruction, Inverse Probl., 20(1) (2003), pp. 103–120
work page 2003
-
[4]
C. Popa, R. Zdunek, Kaczmarz extended algorithm for tomographic image reconstruction from limited data, Math. Comput. Simulation, 65 (2004), pp. 579–598
work page 2004
- [5]
-
[6]
A. Patrascu, I. Necoara, Nonasymptotic convergence of stochastic proximal point methods for constrained convex optimization, J. Mach. Learn. Res., 18(198) (2018), pp. 1–42
work page 2018
-
[7]
Kaczmarz, Angen¨ aherte Aufl¨ osung von Systemen linearer Gleichungen, Bull
S. Kaczmarz, Angen¨ aherte Aufl¨ osung von Systemen linearer Gleichungen, Bull. Int. Acad. Pol. Sci. Lett. A, 35 (1937), pp. 355–357
work page 1937
-
[8]
T. Strohmer, R. Vershynin, A randomized Kaczmarz algorithm with exponential convergence, J. Fourier Anal. Appl., 15 (2009), pp. 262–278
work page 2009
- [9]
- [10]
- [11]
-
[12]
Z.-Z. Bai, L. Wang, On convergence rates of Kaczmarz-type methods with different selection rules of working rows, Appl. Numer. Math., 186 (2023), pp. 289–319. 29
work page 2023
- [13]
- [14]
-
[15]
A. Zouzias, N.M. Freris, Randomized extended Kaczmarz for solving least squares, SIAM J. Matrix Anal. Appl., 34 (2013), pp. 773–793
work page 2013
-
[16]
A. Ma, D. Needell, A. Ramdas, Convergence properties of the randomized extended Gauss-Seidel and Kaczmarz methods, SIAM J. Matrix Anal. Appl., 36 (2015), pp. 1590–1604
work page 2015
- [17]
-
[18]
S. Shalev-Shwartz, T. Zhang, Accelerated mini-batch stochastic dual coordinate ascent, Adv. Neural Inf. Process. Syst., 26 (2013), pp. 1–8
work page 2013
- [19]
-
[20]
D. Needell, J. Tropp, Paved with good intentions: Analysis of a randomized block Kaczmarz method, Linear Algebra Appl., 441 (2014), pp. 199–221
work page 2014
-
[21]
R. M. Gower, P. Richt´ arik, Randomized iterative methods for linear systems, SIAM J. Matrix Anal. Appl., 36(4) (2015), pp. 1660–1690
work page 2015
-
[22]
Necoara, Faster randomized block Kaczmarz algorithms, SIAM J
I. Necoara, Faster randomized block Kaczmarz algorithms, SIAM J. Matrix Anal. Appl., 40 (2019), pp. 1425–1452
work page 2019
-
[23]
J.-Q. Chen, Z.-D. Huang, On a fast deterministic block Kaczmarz method for solving large-scale linear systems, Numer. Algorithms, 89(3) (2022), pp. 1007–1029
work page 2022
- [24]
- [25]
-
[26]
R. M. Gower, D. Molitor, J. Moorman, and D. Needell, On adaptive sketch-and-project for solving linear systems, SIAM J. Matrix Anal. Appl., 42 (2021), pp. 954–989
work page 2021
-
[27]
D. P. Woodruff,Sketching as a Tool for Numerical Linear Algebra, Now Foundations and Trends, 2014
work page 2014
-
[28]
P. G. Martinsson, J. A. Tropp, Randomized numerical linear algebra: Foundations and algorithms, Acta Numer., 29 (2020), pp. 403–572
work page 2020
- [29]
-
[30]
A. Kireeva, J. A. Tropp, Randomized matrix computations: Themes and variations, arXiv preprint arXiv:2402.17873, 2024
-
[31]
M. Derezi´ nski, E. Rebrova, Sharp analysis of sketch-and-project methods via a connection to randomized singular value decomposition, SIAM J. Math. Data Sci., 6 (2024), pp. 127–153
work page 2024
-
[32]
P. J. Denning, The locality principle, Commun. ACM, 48(7) (2005), pp. 19–24
work page 2005
-
[33]
T. A. Davis, Y. Hu, The university of Florida sparse matrix collection, ACM Trans Math Softw., 38(1) (2011), pp. 1–25
work page 2011
-
[34]
R. Vershynin,High-Dimensional Probability: An Introduction with Applications in Data Science, Cambridge Univer- sity Press, Cambridge, 2018
work page 2018
-
[35]
S. Boucheron, G. Lugosi, P. Massart,Concentration Inequalities: A Nonasymptotic Theory of Independence, Oxford University Press, Oxford, 2013
work page 2013
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.