pith. sign in

arxiv: 1907.05666 · v1 · pith:RLPV6TUSnew · submitted 2019-07-12 · 🧮 math.NA · cs.NA

Adaptive Regularization Parameter Choice Rules for Large-Scale Problems

Pith reviewed 2026-05-24 22:40 UTC · model grok-4.3

classification 🧮 math.NA cs.NA
keywords regularization parameter choiceTikhonov regularizationKrylov subspacesGolub-Kahan bidiagonalizationbilevel optimizationdiscrepancy principlelarge-scale inverse problemsGauss quadrature
0
0 comments X

The pith

New strategies interlace iterations to simultaneously tune the Tikhonov parameter and Krylov subspace dimension for large inverse problems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops new adaptive strategies for choosing regularization parameters when applying Tikhonov regularization to large-scale linear inverse problems projected onto Krylov subspaces of growing dimension. It frames the task as a bilevel optimization problem, with the lower level handling the projected Tikhonov solve and the higher level applying a parameter choice rule. These levels are solved together by interlacing their iterations rather than running them sequentially. Standard rules including the discrepancy principle, GCV, quasi-optimality, and Reginska criteria are adapted to this setting. Links between Golub-Kahan bidiagonalization and Gauss quadrature are used to prove convergence for the discrepancy principle, and numerical tests on imaging problems indicate the resulting methods are reliable while being simpler and less expensive than prior approaches.

Core claim

The paper derives a new class of adaptive regularization parameter choice strategies that simultaneously set the Tikhonov parameter and the dimension of the projection subspace by interlacing the iterations performed to project the Tikhonov problem with those performed to apply a given parameter choice rule, regarding these as special instances of bilevel optimization methods; the discrepancy principle, GCV, quasi-optimality criterion, and Reginska criterion are all adapted within this framework, with convergence results for the discrepancy principle proved via the links between Gauss quadrature and Golub-Kahan bidiagonalization.

What carries the argument

Bilevel optimization solved by interlacing lower-level Tikhonov projection onto Krylov subspaces via Golub-Kahan bidiagonalization with higher-level parameter choice rules.

If this is right

  • The discrepancy principle, GCV, quasi-optimality criterion, and Reginska criterion can all be adapted to the interlaced bilevel framework.
  • Convergence results for the adapted discrepancy principle follow from the established connection between Gauss quadrature and Golub-Kahan bidiagonalization.
  • The resulting regularization methods are reliable for large-scale problems and intrinsically simpler and cheaper than other strategies in the literature.
  • Numerical tests modeling inverse problems in imaging confirm the practical effectiveness of the new strategies.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The interlacing idea could be applied to regularization methods other than Tikhonov to achieve similar simultaneous tuning of multiple parameters.
  • Efficiency gains from avoiding separate iteration loops might make these strategies viable for real-time or very high-dimensional inverse problems where sequential tuning is prohibitive.
  • The quadrature connection exploited for convergence might be extended to give theoretical insight into the behavior of the GCV and quasi-optimality rules in this setting.

Load-bearing premise

Interlacing the iterations for projecting the Tikhonov problem with those for applying the parameter choice rule produces accurate simultaneous tuning of both the parameter and the subspace dimension without introducing extra errors or excessive cost.

What would settle it

A direct comparison on the same large-scale test problem between the interlaced method and a non-interlaced sequential method that shows the two select materially different parameters or subspace dimensions and produce reconstructions with substantially different error norms.

Figures

Figures reproduced from arXiv: 1907.05666 by Malena Sabate Landman, Silvia Gazzola.

Figure 1
Figure 1. Figure 1: Geometrical interpretation of the modified Newton method (27). [PITH_FULL_IMAGE:figures/full_fig_p010_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Values of the function kb − Ax(β)k 2 = fbDP + ε 2 in (24), lower bounds Gk + ε 2 (29), and upper bounds Rk+1 + ε 2 (31) for k = 2, 5, 8, 30, versus β, for the problem in (19); values displayed in logarithmic scale. Stopping criteria for Algorithm 3. Since the modified Newton method (30) can essentially be regarded as a Newton-like update formula applied to a sequence of iteration-dependent converging funct… view at source ↗
Figure 3
Figure 3. Figure 3: Note that, even after 225 iterations, the bound [PITH_FULL_IMAGE:figures/full_fig_p014_3.png] view at source ↗
Figure 3
Figure 3. Figure 3: Values of the GCV functional (37) and upper bounds (39) for [PITH_FULL_IMAGE:figures/full_fig_p015_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Values of the quasi optimality functional (40), and some of its lower bounds (42) and [PITH_FULL_IMAGE:figures/full_fig_p016_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Values of the Regi´nska’s functional (43), and some of its lower bounds (45) and [PITH_FULL_IMAGE:figures/full_fig_p017_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Example 1. First row: values of the regularization parameter [PITH_FULL_IMAGE:figures/full_fig_p020_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Higher-level surface for Example 1. 50 logarithmically equispaced values of α between 10−6 and 1 and increasing values of k up to 150 are considered. (a) Discrepancy principle (i.e., kb−Axk(α)k 2 ); (b) GCV (i.e., upper bounds Pk(α) in (39)); (c) quasi-optimality criterion (i.e., upper bounds Pk(α) in (41)); (d) Regi´nska criterion (i.e., upper bounds Pk(α) in (44)). The green stars and black circles highl… view at source ↗
Figure 8
Figure 8. Figure 8: Error surface for Example 1. 50 logarithmically equispaced values of α between 10−6 and 1 and k = 1, ..., 150 are sampled. The markers highlight the combinations of the sampled values of α and k leading the minimal relative reconstruction errors, for all k = 1, ..., 150. Example 2 A test problem modelling X-ray tomography involving the Shepp-Logan phantom of size 256 × 256 pixels, acquired through a parall… view at source ↗
Figure 9
Figure 9. Figure 9: Example 2. First row: values of the regularization parameter [PITH_FULL_IMAGE:figures/full_fig_p022_9.png] view at source ↗
Figure 10
Figure 10. Figure 10: Higher-level surface for Example 2. 50 logarithmically equispaced values of α between 10−4 and 104 and k = 1, ..., 100 are considered. (a) Discrepancy principle (i.e., kb − Axk(α)k 2 ); (b) GCV (i.e., upper bounds Pk(α) in (39)); (c) quasi-optimality criterion (i.e., upper bounds Pk(α) in (41)); (d) Regi´nska criterion (i.e., upper bounds Pk(α) in (44)). The green stars and black circles highlight the qua… view at source ↗
Figure 11
Figure 11. Figure 11: Error surface for Example 2. 50 logarithmically equispaced values of α between 10−4 and 104 and k = 1, ..., 100 are sampled. The markers highlight the combinations of the sampled values of α and k leading the minimal relative reconstruction errors, for all k = 1, ..., 100. [7] J. Chung, J. G. Nagy, and D. P. O’Leary. A weighted-GCV method for Lanczos-hybrid regularization. Electron. Trans. Numer. Anal., 2… view at source ↗
read the original abstract

This paper derives a new class of adaptive regularization parameter choice strategies that can be effectively and efficiently applied when regularizing large-scale linear inverse problems by combining standard Tikhonov regularization and projection onto Krylov subspaces of increasing dimension (computed by the Golub-Kahan bidiagonalization algorithm). The success of this regularization approach heavily depends on the accurate tuning of two parameters (namely, the Tikhonov parameter and the dimension of the projection subspace): these are simultaneously set using new strategies that can be regarded as special instances of bilevel optimization methods, which are solved by using a new paradigm that interlaces the iterations performed to project the Tikhonov problem (lower-level problem) with those performed to apply a given parameter choice rule (higher-level problem). The discrepancy principle, the GCV, the quasi-optimality criterion, and Regi\'{n}ska criterion can all be adapted to work in this framework. The links between Gauss quadrature and Golub-Kahan bidiagonalization are exploited to prove convergence results for the discrepancy principle, and to give insight into the behavior of the other considered regularization parameter choice rules. Several numerical tests modeling inverse problems in imaging show that the new parameter choice strategies lead to regularization methods that are reliable, and intrinsically simpler and cheaper than other strategies already available in the literature.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper claims to derive a new class of adaptive regularization parameter choice strategies for large-scale linear inverse problems by combining Tikhonov regularization with projection onto Krylov subspaces of increasing dimension via the Golub-Kahan bidiagonalization algorithm. The Tikhonov parameter and subspace dimension are tuned simultaneously through a bilevel optimization paradigm that interlaces lower-level projection iterations with higher-level iterations of parameter choice rules (discrepancy principle, GCV, quasi-optimality criterion, and Regińska criterion). Gauss quadrature links are exploited to prove convergence for the discrepancy principle and provide insight into the others; numerical tests on imaging inverse problems are used to demonstrate that the resulting methods are reliable, simpler, and cheaper than existing alternatives.

Significance. If the interlacing construction can be shown to produce accurate simultaneous tuning without bias from partial projections, the work would provide a practically useful and computationally efficient framework for regularization of large-scale problems. The explicit use of Gauss quadrature connections for rigorous convergence results on at least the discrepancy principle is a methodological strength, as is the attempt to adapt multiple standard rules within a unified iterative scheme. The numerical evidence, if reproducible, would support claims of reliability in imaging applications.

major comments (3)
  1. [Abstract / interlacing construction] Abstract and the section describing the interlacing paradigm: the central claim that interlacing Golub-Kahan iterations with the higher-level parameter choice rules produces accurate tuning without introducing additional errors lacks explicit perturbation bounds on the discrepancy, GCV, quasi-optimality, or Regińska functionals when evaluated on the rank-k bidiagonal matrix rather than the full operator. No quantitative estimate is given for how early termination of the inner iteration affects the outer fixed-point or minimization step, which directly bears on the reliability of the bilevel approach.
  2. [Convergence analysis] Convergence analysis section: rigorous convergence is established only for the discrepancy principle via Gauss quadrature; for GCV, quasi-optimality, and Regińska the argument is described as providing only “insight.” This distinction is load-bearing because the abstract presents all four rules as successfully adapted within the same framework, yet the absence of comparable error analysis for the non-discrepancy rules leaves the generality of the convergence claims unsupported.
  3. [Numerical experiments] Numerical experiments section: the claim that the new strategies are “intrinsically simpler and cheaper than other strategies already available in the literature” is not supported by direct runtime or flop-count comparisons against specific competing bilevel or hybrid methods; the reported tests show qualitative success but do not quantify the cost savings that would justify the efficiency assertion.
minor comments (2)
  1. [Preliminaries] Notation for the projected Tikhonov problem and the bidiagonal matrices should be introduced with explicit definitions of all symbols (e.g., the relation between the full operator A and the k-step bidiagonal matrix B_k) to improve readability for readers unfamiliar with Golub-Kahan details.
  2. [Numerical experiments] Figure captions for the imaging test results should include the specific values of the regularization parameter and subspace dimension selected by each rule, rather than only visual reconstructions, to allow direct assessment of the adaptive choices.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the thorough review and constructive feedback on our manuscript. We address each major comment in detail below, indicating where revisions will be made to strengthen the presentation.

read point-by-point responses
  1. Referee: [Abstract / interlacing construction] Abstract and the section describing the interlacing paradigm: the central claim that interlacing Golub-Kahan iterations with the higher-level parameter choice rules produces accurate tuning without introducing additional errors lacks explicit perturbation bounds on the discrepancy, GCV, quasi-optimality, or Regińska functionals when evaluated on the rank-k bidiagonal matrix rather than the full operator. No quantitative estimate is given for how early termination of the inner iteration affects the outer fixed-point or minimization step, which directly bears on the reliability of the bilevel approach.

    Authors: We agree that the manuscript would benefit from a more quantitative discussion of the perturbation effects induced by early termination of the inner Golub-Kahan iterations. While the existing Gauss-quadrature analysis already provides some control on the functionals, we will add a dedicated remark (or short subsection) deriving first-order perturbation bounds for the discrepancy functional and qualitative estimates for the remaining rules, thereby clarifying the reliability of the interlacing construction. revision: yes

  2. Referee: [Convergence analysis] Convergence analysis section: rigorous convergence is established only for the discrepancy principle via Gauss quadrature; for GCV, quasi-optimality, and Regińska the argument is described as providing only “insight.” This distinction is load-bearing because the abstract presents all four rules as successfully adapted within the same framework, yet the absence of comparable error analysis for the non-discrepancy rules leaves the generality of the convergence claims unsupported.

    Authors: The abstract already distinguishes the results precisely: convergence is proved for the discrepancy principle while only insight is provided for the other three rules. This accurately reflects the manuscript content. The bilevel framework computationally adapts all four rules within the same interlacing scheme; the theoretical guarantees are strongest for the discrepancy principle, as stated. No uniform convergence claim is made for the non-discrepancy rules. revision: no

  3. Referee: [Numerical experiments] Numerical experiments section: the claim that the new strategies are “intrinsically simpler and cheaper than other strategies already available in the literature” is not supported by direct runtime or flop-count comparisons against specific competing bilevel or hybrid methods; the reported tests show qualitative success but do not quantify the cost savings that would justify the efficiency assertion.

    Authors: We accept that the efficiency claim would be better supported by explicit comparisons. In the revised manuscript we will include flop-count estimates and representative timing results for the proposed interlaced methods versus standard hybrid projection-regularization approaches, thereby quantifying the claimed savings. revision: yes

Circularity Check

0 steps flagged

No load-bearing circularity; derivations rely on external Gauss quadrature links

full rationale

The paper adapts standard parameter-choice rules (discrepancy, GCV, quasi-optimality, Regińska) to an interlaced bilevel framework using Golub-Kahan bidiagonalization. Convergence for the discrepancy principle is proved via established Gauss-quadrature connections to the bidiagonalization process; other rules receive only qualitative insight. No quoted equations reduce a claimed prediction or uniqueness result to a fitted input or self-citation by construction. The approach is presented as a computational reorganization of known methods rather than a self-referential derivation. This is the expected honest outcome for a paper whose proofs rest on external mathematical facts.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Based solely on the abstract; no explicit free parameters, axioms, or invented entities are detailed beyond standard regularization assumptions.

axioms (1)
  • domain assumption Standard assumptions of Tikhonov regularization and Krylov subspace methods hold for the projected problems.
    Implicit in the framework described in the abstract.

pith-pipeline@v0.9.0 · 5762 in / 1241 out tokens · 46897 ms · 2026-05-24T22:40:16.716552+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

31 extracted references · 31 canonical work pages

  1. [1]

    Bj¨ orck.Numerical Methods in Matrix Computations

    ˚A. Bj¨ orck.Numerical Methods in Matrix Computations . Springer, Switzerland, 2015

  2. [2]

    Bj¨ orck, E

    ˚A. Bj¨ orck, E. Grimme, and P. van Dooren. An implicit shift bidiagonalization algorithm for ill-posed systems. BIT, 34(4):510–534, 1994

  3. [3]

    Calvetti, G

    D. Calvetti, G. H. Golub, and L. Reichel. Estimation of the L-curve via Lanczos bidiago- nalization. BIT, 39(4):603–619, Dec 1999

  4. [4]

    Calvetti, L

    D. Calvetti, L. Reichel, and A. Shuibi. L-curve and curvature bounds for Tikhonov regu- larization. Numer. Algorithms, 35(2):301–314, Apr 2004

  5. [5]

    Chung and S

    J. Chung and S. Gazzola. Flexible Krylov methods for ℓp regularization. to appear, 2019

  6. [6]

    Chung, M

    J. Chung, M. E. Kilmer, and D. P. O’Leary. A framework for regularization via operator approximation. SIAM J. Sci. Comput. , 37(2):B332–59, 2015. 23 (a) (b) Figure 11: Error surface for Example 2. 50 logarithmically equispaced values of α between 10−4 and 104 and k = 1, ...,100 are sampled. The markers highlight the combinations of the sampled values of α...

  7. [7]

    Chung, J

    J. Chung, J. G. Nagy, and D. P. O’Leary. A weighted-GCV method for Lanczos-hybrid regularization. Electron. Trans. Numer. Anal., 28:149–167, 2008

  8. [8]

    C. Fenu, L. Reichel, and G. Rodriguez. GCV for Tikhonov regularization via global Golub– Kahan decomposition. Numer. Linear Algebra Appl. , 23, 02 2016

  9. [9]

    Frommer and P

    A. Frommer and P. Maass. Fast CG-based methods for Tikhonov-Phillips regularization. SIAM J. Sci. Comput. , 20(6):1831–1850, 1999

  10. [10]

    Gazzola, P.C

    S. Gazzola, P.C. Hansen, and J.G. Nagy. IR Tools: a MATLAB package of iterative regularization methods and large-scale test problems. Numer. Algorithms, 2018

  11. [11]

    Gazzola and P

    S. Gazzola and P. Novati. Automatic parameter setting for Arnoldi-Tikhonov methods. J. Comput. Appl. Math. , 256:180–195, 2014

  12. [12]

    Gazzola, P

    S. Gazzola, P. Novati, and M. R. Russo. On Krylov projection methods and Tikhonov regularization. Electron. Trans. Numer. Anal., 44:83–123, 2015

  13. [13]

    Golub and V

    G. Golub and V. Pereyra. Separable nonlinear least squares: the variable projection method and its applications. Inverse Problems, 19(2):R1, 2003

  14. [14]

    G. H. Golub and U. Von Matt. Generalized cross-validation for large-scale problems. J. Comput. Graph. Statist. , 6:1–34, 1997

  15. [15]

    G. H. Golub and G. Meurant. Matrices, moments, and quadrature with applications . Princeton University Press, Princeton, NJ, 2010

  16. [16]

    P. C. Hansen. Discrete Inverse Problems: Insight and Algorithms . Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2010

  17. [17]

    Hnˇ etynkov´ a, M

    I. Hnˇ etynkov´ a, M. Pleˇ singer, and Z. Strakoˇ s. The regularizing effect of the Golub-Kahan iterative bidiagonalization and revealing the noise level in the data. BIT, 49(4):669–696, 2009

  18. [18]

    B. Hofmann. Regularization of applied inverse and ill-posed problems . Teubner, Leipzig, 1986

  19. [19]

    Kilmer and D

    M. Kilmer and D. O’Leary. Choosing regularization parameters in iterative methods for ill-posed problems. SIAM J. Matrix Anal. Appl. , 22(4):1204–1221, 2001. 24

  20. [20]

    Kunisch and T

    K. Kunisch and T. Pock. A bilevel optimization approach for parameter learning in vari- ational models. SIAM J. Imaging Sci. , 6(2):938–983, 2013

  21. [21]

    L´ opez Lagomasino, L

    G. L´ opez Lagomasino, L. Reichel, and L. Wunderlich. Matrices, moments, and rational quadrature. Linear Algebra Appl., 429(10):2540–2554, 2008

  22. [22]

    V. A. Morozov. On the solution of functional equations by the method of regularization. Soviet Math. Dokl. , 7:414–417, 2008

  23. [23]

    Novati and M

    P. Novati and M. R. Russo. A GCV-based Arnoldi-Tikhonov regularization method. BIT, 54:501–521, 2014

  24. [24]

    D. P. O’Leary and J. A. Simmons. A bidiagonalization-regularization procedure for large scale discretizations of ill-posed problems. SIAM J. Sci. Statist. Comput. , 2(4):474–489, 1981

  25. [25]

    Regi´ nska

    T. Regi´ nska. A regularization parameter in discrete ill-posed problems. SIAM J. Sci. Comput., 17:740–749, 1996

  26. [26]

    Reichel and A

    L. Reichel and A. Shyshkov. A new zero-finder for Tikhonov regularization. BIT, 48:627– 643, 2008

  27. [27]

    R. A. Renaut, M. Horst, Y. Wang, D. Cochran, and J. Hansen. Efficient estimation of regularization parameters via downsampling and the singular value expansion. BIT, 57(2):499–529, 2017

  28. [28]

    R. A. Renaut, S. Vatankhah, and V. E. Ardestani. Hybrid and iteratively reweighted regularization by unbiased predictive risk and weighted gcv for projected systems. SIAM J. Sci. Comput. , 39(2):B221–B243, 2017

  29. [29]

    A. K. Saibaba, A. Alexanderian, and I. C. F. Ipsen. Randomized Matrix-Free Trace and Log-Determinant Estimators. Numer. Math., 137(5):353–395, 2017

  30. [30]

    G. Wahba. Practical approximate solutions to linear operator equations when the data are noisy. SIAM J. Numer. Anal. , 14(4):651–667, 1977

  31. [31]

    Zama and E

    F. Zama and E. Loli Piccolomini. A descent method for regularization of ill-posed problems. Optim. Methods Softw. , 20(4–5):615–625, 2005. 25