pith. sign in

arxiv: 2511.09872 · v2 · submitted 2025-11-13 · 🧮 math.NA · cs.NA· stat.ML

Randomized batch-sampling Kaczmarz methods for solving linear systems

Pith reviewed 2026-05-17 23:03 UTC · model grok-4.3

classification 🧮 math.NA cs.NAstat.ML
keywords randomized Kaczmarz methodsbatch samplinglinear systemsconvergence analysisconcentration inequalitiesscale-invariant boundsblock methodsstochastic sampling
0
0 comments X

The pith

A unified randomized batch-sampling Kaczmarz framework yields scale-invariant expected linear convergence bounds for arbitrary static samplings.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a unified randomized batch-sampling Kaczmarz framework that keeps per-iteration costs as low as those of cyclic block methods while supplying a general analysis technique. Concentration inequalities are used to obtain new expected linear convergence rate bounds that apply to any randomized non-extended block Kaczmarz method equipped with arbitrary static stochastic samplings. These bounds are scale-invariant and therefore independent of the magnitude of the data matrix; experiments indicate they are tighter than prior bounds and track observed behavior more closely. The framework also treats the batch-sampling distribution itself as a learnable parameter that can be tuned for particular applications.

Core claim

We adopt a unified randomized batch-sampling Kaczmarz framework with per-iteration costs as low as cyclic block methods, and develop a general analysis technique to establish its convergence guarantee. With concentration inequalities, we derive new expected linear convergence rate bounds. The analysis applies to any randomized non-extended block Kaczmarz methods with arbitrary static stochastic samplings. In addition, the new rate bounds are scale-invariant, which eliminate the dependence on the magnitude of the data matrix.

What carries the argument

Unified randomized batch-sampling Kaczmarz framework together with concentration inequalities applied to arbitrary static stochastic samplings.

If this is right

  • The new bounds are significantly tighter than existing ones in most experiments and better reflect the empirical convergence behavior of block methods.
  • The batch-sampling distribution can be treated as a learnable parameter to achieve efficient performance in specific application scenarios.
  • Convergence guarantees hold for arbitrary static stochastic samplings without further restrictions on the matrix.
  • Scale invariance removes any dependence on the magnitude of the data matrix.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The framework could support development of adaptive or data-driven sampling distributions beyond the static case considered here.
  • Scale-invariant bounds may simplify use on problems where data normalization varies across instances or iterations.
  • Similar concentration-based arguments might transfer to other classes of randomized iterative solvers for linear systems.
  • The low per-iteration cost property suggests direct applicability to very large-scale systems where block size must remain modest.

Load-bearing premise

Concentration inequalities can be applied directly to produce tight, scale-invariant bounds that hold for arbitrary static stochastic samplings without additional restrictions on the matrix or sampling distribution.

What would settle it

Finding a matrix and static sampling distribution for which the derived rate bound is violated or loses scale invariance would falsify the claim.

read the original abstract

To conduct a more in-depth investigation of randomized solvers for solving linear systems, we adopt a unified randomized batch-sampling Kaczmarz framework with per-iteration costs as low as cyclic block methods, and develop a general analysis technique to establish its convergence guarantee. With concentration inequalities, we derive new expected linear convergence rate bounds. The analysis applies to any randomized non-extended block Kaczmarz methods with arbitrary static stochastic samplings. In addition, the new rate bounds are scale-invariant, which eliminate the dependence on the magnitude of the data matrix. In most experiments, the new bounds are significantly tighter than existing ones and better reflect the empirical convergence behavior of block methods. Within this new framework, the batch-sampling distribution, as a learnable parameter, provides the possibility for block methods to achieve efficient performance in specific application scenarios, which deserves further investigation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces a unified randomized batch-sampling Kaczmarz framework for solving linear systems with per-iteration costs comparable to cyclic block methods. It develops a general analysis technique that applies concentration inequalities to derive new expected linear convergence rate bounds. These bounds are claimed to hold for any randomized non-extended block Kaczmarz method under arbitrary static stochastic samplings and to be scale-invariant (independent of the magnitude of the data matrix A). Experiments indicate the new bounds are tighter than prior results and better match observed convergence behavior. The framework also treats the batch-sampling distribution as a learnable parameter for application-specific optimization.

Significance. If the analysis is correct, the work would supply a flexible, general-purpose tool for obtaining convergence guarantees across a wide family of block Kaczmarz variants while removing explicit dependence on matrix scaling. The ability to treat sampling distributions as tunable parameters could also enable practical performance gains in targeted applications. These features would be of interest to the numerical linear algebra community working on randomized iterative solvers.

major comments (2)
  1. [Analysis section (derivation of the convergence bound)] The central claim that concentration inequalities directly produce tight, scale-invariant expected linear rates for arbitrary static samplings (without extra restrictions on A or the sampling law) is load-bearing. The derivation must be shown to eliminate all dependence on row norms or matrix scaling factors; otherwise the scale-invariance and generality assertions do not hold. This issue is not resolved by the high-level description alone.
  2. [Analysis section (application of concentration inequalities)] The manuscript provides no explicit verification that the random variables to which concentration inequalities are applied satisfy the required boundedness or sub-Gaussian conditions independently of scaling. If these conditions implicitly reintroduce dependence on ||A|| or row norms, the claimed generality fails.
minor comments (2)
  1. [Introduction / Framework] Clarify the precise definition of 'non-extended block Kaczmarz' and 'static stochastic sampling' early in the paper to avoid ambiguity for readers unfamiliar with the subfield.
  2. [Numerical experiments] The experimental section would benefit from explicit reporting of matrix scaling factors used in the test problems to directly illustrate the scale-invariance property.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful and constructive review of our manuscript. The comments on the analysis section are well-taken, and we have revised the paper to provide a more explicit and detailed derivation along with the requested verifications. Below we respond point by point to the major comments.

read point-by-point responses
  1. Referee: [Analysis section (derivation of the convergence bound)] The central claim that concentration inequalities directly produce tight, scale-invariant expected linear rates for arbitrary static samplings (without extra restrictions on A or the sampling law) is load-bearing. The derivation must be shown to eliminate all dependence on row norms or matrix scaling factors; otherwise the scale-invariance and generality assertions do not hold. This issue is not resolved by the high-level description alone.

    Authors: We agree that a high-level description is insufficient and that the derivation must be shown explicitly. In the revised manuscript we have expanded the analysis section with a complete step-by-step derivation. The key step is to apply the concentration inequalities to suitably normalized random variables whose moments are expressed solely in terms of the sampling probabilities and the normalized rows of A; all absolute scaling factors cancel in the resulting expectation. We have added a new lemma that isolates this cancellation and an appendix containing the full expanded proof. These changes directly address the load-bearing claim. revision: yes

  2. Referee: [Analysis section (application of concentration inequalities)] The manuscript provides no explicit verification that the random variables to which concentration inequalities are applied satisfy the required boundedness or sub-Gaussian conditions independently of scaling. If these conditions implicitly reintroduce dependence on ||A|| or row norms, the claimed generality fails.

    Authors: We acknowledge that the original manuscript lacked an explicit verification of the boundedness/sub-Gaussian conditions. In the revision we have inserted a dedicated paragraph immediately following the statement of the concentration inequality. There we verify that the relevant random variables are bounded by quantities that depend only on the sampling distribution and the normalized matrix (i.e., rows scaled by their Euclidean norms). Because the normalization is performed inside the random variable itself, the bound remains invariant under global scaling of A. This verification is now stated as a separate proposition to make the independence from ||A|| transparent. revision: yes

Circularity Check

0 steps flagged

No circularity: derivation applies standard concentration inequalities to a general framework

full rationale

The paper introduces a unified randomized batch-sampling Kaczmarz framework and develops a general analysis technique that invokes concentration inequalities to obtain expected linear convergence rates. These rates are stated to hold for arbitrary static stochastic samplings and to be scale-invariant by construction of the bounds rather than by fitting or self-definition. No load-bearing self-citations, uniqueness theorems from prior author work, or renamings of known results are indicated in the abstract or description; the central claims rest on applying external probabilistic tools to the iterates without reducing to the inputs by definition. The analysis is presented as independent of specific matrix restrictions beyond the framework itself.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claims rest on standard probabilistic concentration tools and the assumption that the framework covers arbitrary static samplings without introducing new fitted constants or entities.

axioms (1)
  • domain assumption Concentration inequalities apply to the random sampling process in the Kaczmarz iteration
    Invoked to derive the expected linear convergence rate bounds

pith-pipeline@v0.9.0 · 5445 in / 1239 out tokens · 35574 ms · 2026-05-17T23:03:48.424489+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

35 extracted references · 35 canonical work pages

  1. [1]

    P. C. Hansen, J. Jørgensen, W. R. B. Lionheart,Computed Tomography: Algorithms, Insight, and Just Enough Theory, Society for Industrial and Applied Mathematics, Philadelphia, 2021

  2. [2]

    M. A. Olshanskii, E. E. Tyrtyshnikov,Iterative Methods for Linear Systems: Theory and Applications, Society for Industrial and Applied Mathematics, Philadelphia, 2014

  3. [3]

    Byrne, A unified treatment of some iterative algorithms in signal processing and image reconstruction, Inverse Probl., 20(1) (2003), pp

    C. Byrne, A unified treatment of some iterative algorithms in signal processing and image reconstruction, Inverse Probl., 20(1) (2003), pp. 103–120

  4. [4]

    C. Popa, R. Zdunek, Kaczmarz extended algorithm for tomographic image reconstruction from limited data, Math. Comput. Simulation, 65 (2004), pp. 579–598

  5. [5]

    Chang, C

    K.-W. Chang, C. J. Hsieh, C.-J. Lin, Coordinate descent method for large-scalel 2-loss linear support vector machines, J. Mach. Learn. Res., 9(45) (2008), pp. 1369–1398

  6. [6]

    Patrascu, I

    A. Patrascu, I. Necoara, Nonasymptotic convergence of stochastic proximal point methods for constrained convex optimization, J. Mach. Learn. Res., 18(198) (2018), pp. 1–42

  7. [7]

    Kaczmarz, Angen¨ aherte Aufl¨ osung von Systemen linearer Gleichungen, Bull

    S. Kaczmarz, Angen¨ aherte Aufl¨ osung von Systemen linearer Gleichungen, Bull. Int. Acad. Pol. Sci. Lett. A, 35 (1937), pp. 355–357

  8. [8]

    Strohmer, R

    T. Strohmer, R. Vershynin, A randomized Kaczmarz algorithm with exponential convergence, J. Fourier Anal. Appl., 15 (2009), pp. 262–278

  9. [9]

    Bai, W.-T

    Z.-Z. Bai, W.-T. Wu, On greedy randomized Kaczmarz method for solving large sparse linear systems, SIAM J. Sci. Comput., 40 (2018) pp. A592–A606

  10. [10]

    Bai, W.-T

    Z.-Z. Bai, W.-T. Wu, On relaxed greedy randomized Kaczmarz methods for solving large sparse linear systems, Appl. Math. Lett., 83 (2018), pp. 21–26

  11. [11]

    Bai, W.-T

    Z.-Z. Bai, W.-T. Wu, On convergence rate of the randomized Kaczmarz method, Linear Algebra Appl. 553 (2018), pp. 252–269

  12. [12]

    Z.-Z. Bai, L. Wang, On convergence rates of Kaczmarz-type methods with different selection rules of working rows, Appl. Numer. Math., 186 (2023), pp. 289–319. 29

  13. [13]

    Bai, W.-T

    Z.-Z. Bai, W.-T. Wu, On partially randomized extended Kaczmarz method for solving large sparse overdetermined inconsistent linear systems, Linear Algebra Appl., 578 (2019), pp. 225–250

  14. [14]

    Bai, W.-T

    Z.-Z. Bai, W.-T. Wu, On greedy randomized augmented Kaczmarz method for solving large sparse inconsistent linear systems, SIAM J. Sci. Comput., 43 (2021), pp. A3892–A3911

  15. [15]

    Zouzias, N.M

    A. Zouzias, N.M. Freris, Randomized extended Kaczmarz for solving least squares, SIAM J. Matrix Anal. Appl., 34 (2013), pp. 773–793

  16. [16]

    A. Ma, D. Needell, A. Ramdas, Convergence properties of the randomized extended Gauss-Seidel and Kaczmarz methods, SIAM J. Matrix Anal. Appl., 36 (2015), pp. 1590–1604

  17. [17]

    Dekel, R

    O. Dekel, R. Gilad-Bachrach, O. Shamir, et al., Optimal distributed online prediction using mini-batches, J. Mach. Learn. Res., 13(1) (2012), pp. 165–202

  18. [18]

    Shalev-Shwartz, T

    S. Shalev-Shwartz, T. Zhang, Accelerated mini-batch stochastic dual coordinate ascent, Adv. Neural Inf. Process. Syst., 26 (2013), pp. 1–8

  19. [19]

    Bai, X.-G

    Z.-Z. Bai, X.-G. Liu, On the Meany inequality with applications to convergence analysis of several row-action iteration methods, Numer. Math., 124 (2013), pp. 215–236

  20. [20]

    Needell, J

    D. Needell, J. Tropp, Paved with good intentions: Analysis of a randomized block Kaczmarz method, Linear Algebra Appl., 441 (2014), pp. 199–221

  21. [21]

    R. M. Gower, P. Richt´ arik, Randomized iterative methods for linear systems, SIAM J. Matrix Anal. Appl., 36(4) (2015), pp. 1660–1690

  22. [22]

    Necoara, Faster randomized block Kaczmarz algorithms, SIAM J

    I. Necoara, Faster randomized block Kaczmarz algorithms, SIAM J. Matrix Anal. Appl., 40 (2019), pp. 1425–1452

  23. [23]

    Chen, Z.-D

    J.-Q. Chen, Z.-D. Huang, On a fast deterministic block Kaczmarz method for solving large-scale linear systems, Numer. Algorithms, 89(3) (2022), pp. 1007–1029

  24. [24]

    Du, W.-T

    K. Du, W.-T. Si, X.-H. Sun, Randomized extended average block Kaczmarz for solving least squares, SIAM J. Sci. Comput., 42 (2020), pp. A3541–A3559

  25. [25]

    Tan, X.-P

    L.-Z. Tan, X.-P. Guo, M.-Y. Deng, et al., On the adaptive deterministic block Kaczmarz method with momentum for solving large-scale consistent linear systems, J. Comput. Appl. Math., 457 (2025), 116328

  26. [26]

    R. M. Gower, D. Molitor, J. Moorman, and D. Needell, On adaptive sketch-and-project for solving linear systems, SIAM J. Matrix Anal. Appl., 42 (2021), pp. 954–989

  27. [27]

    D. P. Woodruff,Sketching as a Tool for Numerical Linear Algebra, Now Foundations and Trends, 2014

  28. [28]

    P. G. Martinsson, J. A. Tropp, Randomized numerical linear algebra: Foundations and algorithms, Acta Numer., 29 (2020), pp. 403–572

  29. [29]

    Murray, J

    R. Murray, J. Demmel, M. W. Mahoney, et al., Randomized numerical linear algebra: A perspective on the field with an eye to software, arXiv preprint arXiv:2302.11474, 2023

  30. [30]

    Kireeva, J

    A. Kireeva, J. A. Tropp, Randomized matrix computations: Themes and variations, arXiv preprint arXiv:2402.17873, 2024

  31. [31]

    Derezi´ nski, E

    M. Derezi´ nski, E. Rebrova, Sharp analysis of sketch-and-project methods via a connection to randomized singular value decomposition, SIAM J. Math. Data Sci., 6 (2024), pp. 127–153

  32. [32]

    P. J. Denning, The locality principle, Commun. ACM, 48(7) (2005), pp. 19–24

  33. [33]

    T. A. Davis, Y. Hu, The university of Florida sparse matrix collection, ACM Trans Math Softw., 38(1) (2011), pp. 1–25

  34. [34]

    Vershynin,High-Dimensional Probability: An Introduction with Applications in Data Science, Cambridge Univer- sity Press, Cambridge, 2018

    R. Vershynin,High-Dimensional Probability: An Introduction with Applications in Data Science, Cambridge Univer- sity Press, Cambridge, 2018

  35. [35]

    Boucheron, G

    S. Boucheron, G. Lugosi, P. Massart,Concentration Inequalities: A Nonasymptotic Theory of Independence, Oxford University Press, Oxford, 2013