Iterative Refinement for Diagonalizable Non-Hermitian Eigendecompositions
Pith reviewed 2026-05-13 19:07 UTC · model grok-4.3
The pith
An iterative refinement method uses matrix multiplications to achieve quadratic residual reduction for approximate eigendecompositions of diagonalizable non-Hermitian matrices.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
For simple eigenvalues the right-only update is selected by a first-order derivation whose resulting residual identity is exact, delivering a quadratic residual bound; the left-right version treats the computable driving matrix as an exact perturbation of the inverse-based driver and shows that the biorthogonality correction obeys an exact Newton-Schulz error identity, producing a local second-order estimate whenever the initial biorthogonality error is small.
What carries the argument
The exact post-update residual identity obtained from the first-order derivation in the right-only regime, together with the Newton-Schulz-type error identity satisfied by the biorthogonality correction in the left-right regime.
If this is right
- The residual after a single right-only update satisfies an exact identity that guarantees quadratic reduction without remainder terms.
- The left-right W-method achieves local second-order convergence once biorthogonality error is sufficiently small.
- Cluster stabilization by re-diagonalization and intracluster suppression extends the method to groups of nearby eigenvalues while preserving the quadratic character outside the cluster.
- All updates require only matrix-matrix multiplications, so the iteration can be applied directly to existing decompositions without forming inverses.
Where Pith is reading between the lines
- The exact identities may let similar refinement steps be derived for related problems such as the singular value decomposition of non-normal matrices.
- Because the method is matrix-multiplication based it could be applied to large sparse matrices provided an initial accurate decomposition is available from any source.
- The unanalyzed attraction region implies that practical use still requires an initial solver whose accuracy is already near the quadratic basin.
- Numerical verification on controlled ill-conditioned clusters suggests the stabilization can be tested on random matrices with known spectrum to map its practical radius of convergence.
Load-bearing premise
The matrix is diagonalizable, the eigenvalues are simple or handled by cluster stabilization, and the initial approximation is already accurate enough to stay inside the unanalyzed attraction region.
What would settle it
Apply one iteration to a known diagonalizable matrix after adding a controlled small perturbation to an approximate right eigenvector and check whether the residual norm squared scales with the square of the initial perturbation size.
read the original abstract
This paper develops matrix-multiplication-based iterative refinement for diagonalizable non-Hermitian eigendecompositions. The main theory concerns simple eigenvalues and distinguishes two input regimes. In the right-only regime, where only approximate right eigenvectors and eigenvalues are available, a first-order derivation selects the update and the resulting post-update residual identity is exact, yielding a quadratic residual bound. In the left-right regime, where approximate left and right eigenvectors are both available, the computable driving matrix is an exact perturbation of the inverse-based one and the biorthogonality correction satisfies an exact Newton--Schulz-type error identity. Under a small biorthogonality error, these relations yield a local second-order estimate for the resulting $W$-method. Clustered eigenvalues are handled separately by a stabilization extension based on clusterwise re-diagonalization and suppression of intracluster corrections, whose effect is verified on controlled matrices with ill-conditioned cluster bases. The method is intended as post-processing for an already accurate eigendecomposition. The attraction region is not analyzed, and no complete theory is given for the clustered case.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. This paper develops matrix-multiplication-based iterative refinement for diagonalizable non-Hermitian eigendecompositions. The main theory concerns simple eigenvalues and distinguishes two input regimes. In the right-only regime, a first-order derivation selects the update and the resulting post-update residual identity is exact, yielding a quadratic residual bound. In the left-right regime, the biorthogonality correction satisfies an exact Newton-Schulz-type error identity that yields a local second-order estimate under small biorthogonality error. Clustered eigenvalues are handled by a stabilization extension based on clusterwise re-diagonalization and suppression of intracluster corrections, verified on controlled matrices.
Significance. If the results hold, the work provides an efficient post-processing technique for refining eigendecompositions of non-Hermitian matrices using only matrix multiplications. The exact post-update residual identities in the right-only regime and the Newton-Schulz-type error relations in the left-right regime are notable strengths that support the quadratic and local second-order claims under the stated assumptions. The parameter-free derivations from matrix algebra add rigor when the initial approximation is sufficiently accurate.
major comments (2)
- Abstract: the quadratic residual bound in the right-only regime and the local second-order estimate in the left-right regime are derived under the assumption that the current approximation already lies inside the (unanalyzed) attraction region; without a characterization or a-priori bound on the initial residual, these convergence claims remain conditional and local only.
- Abstract: the stabilization extension for clustered eigenvalues is presented with verification on controlled matrices but the manuscript states that no complete theory is given for this case; this is load-bearing for any claim that the method rigorously handles clusters.
minor comments (1)
- The abstract could explicitly state the number of matrix multiplications per iteration and the intended use as post-processing for already accurate decompositions to improve clarity for readers.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive report. The comments correctly identify that the convergence claims are local and that the cluster stabilization lacks complete theory; both points are already noted in the manuscript. We will revise the abstract to make these scopes explicit while preserving the technical contributions on the exact residual identities and Newton-Schulz-type relations.
read point-by-point responses
-
Referee: Abstract: the quadratic residual bound in the right-only regime and the local second-order estimate in the left-right regime are derived under the assumption that the current approximation already lies inside the (unanalyzed) attraction region; without a characterization or a-priori bound on the initial residual, these convergence claims remain conditional and local only.
Authors: We agree that the results are conditional on the initial approximation lying inside an attraction region whose size is not characterized. The manuscript already states that the method is intended as post-processing for an already accurate eigendecomposition and that the attraction region is not analyzed. We will revise the abstract to foreground this locality explicitly, e.g., by adding a clause such as “assuming the initial approximation is sufficiently accurate.” This change clarifies the scope without affecting the exact post-update residual identities or the Newton–Schulz-type error relations derived in the body. revision: yes
-
Referee: Abstract: the stabilization extension for clustered eigenvalues is presented with verification on controlled matrices but the manuscript states that no complete theory is given for this case; this is load-bearing for any claim that the method rigorously handles clusters.
Authors: We concur that the cluster stabilization is presented only with numerical verification on controlled matrices and that no complete theory is supplied. The manuscript already records this limitation. We will revise the abstract to describe the extension as “an empirical stabilization procedure verified on controlled matrices with ill-conditioned cluster bases” rather than implying rigorous cluster handling. Additional discussion of the heuristic nature and its limitations can be added to the main text if the referee considers it necessary. revision: yes
Circularity Check
Derivations rely on direct matrix algebra and exact residual identities with no reduction to self-referential inputs or fitted parameters.
full rationale
The paper's core claims rest on first-order expansions and exact post-update residual identities derived from the given matrix equations for diagonalizable non-Hermitian cases. These identities (right-only quadratic bound and left-right Newton-Schulz-type relation) follow directly from algebraic manipulation under the stated assumptions of simple eigenvalues and sufficient initial accuracy, without any parameter fitting, self-definitional loops, or load-bearing self-citations that reduce the result to its own inputs. The explicit statement that the attraction region is unanalyzed is a limitation on the global applicability of the local bounds, but it does not create circularity in the local derivation chain itself, which remains self-contained against the matrix algebra premises.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption The input matrix is diagonalizable.
- domain assumption Eigenvalues under consideration are simple or handled by cluster stabilization.
Lean theorems connected to this paper
-
IndisputableMonolith/Foundation/RealityFromDistinction.leanreality_from_one_distinction unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
In the right-only regime, the update selected by the first-order derivation is proved to satisfy an exact post-update residual identity and a quadratic residual bound.
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
the biorthogonality correction satisfies an exact Newton–Schulz-type error identity
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
Clarendon Press, Oxford (1965)
Wilkinson, J.H.: The Algebraic Eigenvalue Problem. Clarendon Press, Oxford (1965)
work page 1965
-
[2]
Stewart, G.W., Sun, J.-g.: Matrix Perturbation Theory. Academic Press, Boston (1990)
work page 1990
-
[3]
Perturbation Theory for Linear Operators
Kato, T.: Perturbation Theory for Linear Operators, 2nd edn. Classics in Math- ematics. Springer, Berlin (1995). https://doi.org/10.1007/978-3-642-66282-9
-
[4]
Princeton University Press, Princeton (2005)
Trefethen, L.N., Embree, M.: Spectra and Pseudospectra: The Behavior of Non- normal Matrices and Operators. Princeton University Press, Princeton (2005)
work page 2005
-
[5]
(eds.): Templates for the Solution of Algebraic Eigenvalue Problems: A Practical Guide
Bai, Z., Demmel, J., Dongarra, J., Ruhe, A., Vorst, H.A. (eds.): Templates for the Solution of Algebraic Eigenvalue Problems: A Practical Guide. SIAM, Philadelphia (2000). https://doi.org/10.1137/1.9780898719581
-
[6]
Saad, Y.: Numerical Methods for Large Eigenvalue Problems, Revised edn. Clas- sics in Applied Mathematics, vol. 66. SIAM, Philadelphia (2011). https://doi. org/10.1137/1.9781611970739
-
[7]
Acta Numerica 11, 519–584 (2002) https://doi.org/10.1017/S0962492902000089
Sorensen, D.C.: Numerical methods for large eigenvalue problems. Acta Numerica 11, 519–584 (2002) https://doi.org/10.1017/S0962492902000089
-
[8]
Golub, G.H., Van Loan, C.F.: Matrix Computations, 4th edn. Johns Hopkins University Press, Baltimore (2013). https://doi.org/10.56021/9781421407944
-
[9]
Demmel, J.W.: Applied Numerical Linear Algebra. SIAM, Philadelphia (1997). https://doi.org/10.1137/1.9781611971446
-
[10]
SIAM Journal on Numerical Analysis20(1), 23–45 (1983)
Dongarra, J.J., Moler, C.B., Wilkinson, J.H.: Improving the accuracy of computed eigenvalues and eigenvectors. SIAM Journal on Numerical Analysis20(1), 23–45 (1983)
work page 1983
-
[11]
SIAM Journal on Matrix Analysis and Applications22(4), 1038–1057 (2001)
Tisseur, F.: Newton’s method in floating point arithmetic and iterative refine- ment of generalized eigenvalue problems. SIAM Journal on Matrix Analysis and Applications22(4), 1038–1057 (2001)
work page 2001
-
[12]
Linear Algebra and its Applications358, 145–172 (2003) https://doi
Hochstenbach, M.E., Sleijpen, G.L.G.: Two-sided and alternating jacobi– davidson. Linear Algebra and its Applications358, 145–172 (2003) https://doi. org/10.1016/S0024-3795(01)00494-3
-
[13]
Numerical Linear Algebra with Applications 22(1), 175–196 (2015) https://doi.org/10.1002/nla.1945 22
Freitag, M.A., K¨ urschner, P.: Tuned preconditioners for inexact two-sided inverse and rayleigh quotient iteration. Numerical Linear Algebra with Applications 22(1), 175–196 (2015) https://doi.org/10.1002/nla.1945 22
-
[14]
Japan Journal of Industrial and Applied Mathematics35(3), 1007–1035 (2018)
Ogita, T., Aishima, K.: Iterative refinement for symmetric eigenvalue decompo- sition. Japan Journal of Industrial and Applied Mathematics35(3), 1007–1035 (2018)
work page 2018
-
[15]
Japan Journal of Industrial and Applied Mathematics36(2), 435–459 (2019)
Ogita, T., Aishima, K.: Iterative refinement for symmetric eigenvalue decom- position ii: clustered eigenvalues. Japan Journal of Industrial and Applied Mathematics36(2), 435–459 (2019)
work page 2019
-
[16]
Ogita, T., Aishima, K.: Iterative refinement for singular value decomposi- tion based on matrix multiplication. Journal of Computational and Applied Mathematics369, 112512 (2020) https://doi.org/10.1016/j.cam.2019.112512
-
[17]
Numerical Algorithms95(2), 979–1009 (2024) https://doi
Uchino, Y., Terao, T., Ozaki, K.: Acceleration of iterative refinement for singular value decomposition. Numerical Algorithms95(2), 979–1009 (2024) https://doi. org/10.1007/s11075-023-01596-9
-
[18]
JSIAM Letters16, 89–92 (2024) https://doi.org/10
Terao, T., Imamura, T., Ozaki, K.: Iterative refinement for an eigenpair subset of a real symmetric matrix. JSIAM Letters16, 89–92 (2024) https://doi.org/10. 14495/jsiaml.16.89
work page 2024
-
[19]
https: //arxiv.org/abs/2602.23778
Terao, T., Ozaki, K., Imamura, T., Ogita, T.: Iterative Refinement for a Subset of Eigenvectors of Symmetric Matrices via Matrix Multiplications (2026). https: //arxiv.org/abs/2602.23778
-
[20]
https://arxiv.org/abs/2602.19090
Terao, T., Ozaki, K.: Forward Error-Oriented Iterative Refinement for Eigenvec- tors of a Real Symmetric Matrix (2026). https://arxiv.org/abs/2602.19090
-
[21]
Numerical Algorithms92(1), 247–267 (2023) https://doi.org/10.1007/ s11075-022-01327-6
Bujanovi´ c, Z., Kressner, D., Schr¨ oder, C.: Iterative refinement of schur decom- positions. Numerical Algorithms92(1), 247–267 (2023) https://doi.org/10.1007/ s11075-022-01327-6
work page 2023
-
[22]
Accuracy and Stability of Numerical Algorithms (2nd ed.)
Higham, N.J.: Accuracy and Stability of Numerical Algorithms, 2nd edn. SIAM, Philadelphia (2002). https://doi.org/10.1137/1.9780898718027
-
[23]
In: High Performance Computing and Grids in Action
Buttari, A., Dongarra, J., Kurzak, J., Langou, J., Langou, J., Luszczek, P., Tomov, S.: Exploiting mixed precision floating point hardware in scientific com- putations. In: High Performance Computing and Grids in Action. Advances in Parallel Computing, vol. 16, pp. 19–36. IOS Press, Amsterdam, The Netherlands (2008)
work page 2008
-
[24]
Carson, E., Higham, N.J.: Accelerating the solution of linear systems by iterative refinement in three precisions. SIAM Journal on Scientific Computing40(2), 817– 847 (2018) https://doi.org/10.1137/17M1140819
-
[25]
Haidar, A., Tomov, S., Dongarra, J., Higham, N.J.: Harnessing gpu tensor cores for fast fp16 arithmetic to speed up mixed-precision iterative refinement solvers. In: Proceedings of the International Conference for High Performance Computing, 23 Networking, Storage and Analysis, pp. 603–613 (2018). https://doi.org/10.1109/ SC.2018.00050
-
[26]
Acta Numerica31, 347–414 (2022) https://doi.org/10.1017/S0962492922000022
Higham, N.J., Mary, T.: Mixed precision algorithms in numerical linear algebra. Acta Numerica31, 347–414 (2022) https://doi.org/10.1017/S0962492922000022
-
[27]
Numerical Algorithms59(1), 95–118 (2012)
Ozaki, K., Ogita, T., Oishi, S., Rump, S.M.: Error-free transformations of matrix multiplication by using fast routines of matrix multiplication and its applications. Numerical Algorithms59(1), 95–118 (2012)
work page 2012
-
[28]
Ozaki, K., Uchino, Y., Imamura, T.: Ozaki scheme ii: A gemm-oriented emulation of floating-point matrix multiplication using an integer modular technique. arXiv preprint arXiv:2504.08009 (2025)
-
[29]
The University of Florida sparse matrix collection,
Davis, T.A., Hu, Y.: The university of florida sparse matrix collection. ACM Transactions on Mathematical Software38(1), 1–1125 (2011) https://doi.org/10. 1145/2049662.2049663 24
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.