Computing matrix functions associated with a Hermitian--definite pencil
Pith reviewed 2026-05-25 07:25 UTC · model grok-4.3
The pith
Algorithms that combine Schur decomposition with Cholesky factorization compute Af(A^{-1}B) more accurately and efficiently than those using the matrix square root.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Several algorithms are introduced that reduce the evaluation of Af(A^{-1}B) to the Schur decomposition of the pencil followed by either the matrix square root or the Cholesky factorization; analysis of rounding errors and operation counts shows that the Cholesky-based versions incur smaller perturbations and lower cost, and this ordering is observed in numerical tests.
What carries the argument
Reduction of Af(A^{-1}B) via Schur triangularization of the pencil, followed by either matrix square root or Cholesky factorization to obtain the required function values on the diagonal.
If this is right
- Cholesky-based algorithms produce smaller rounding errors than square-root-based ones when evaluated in floating-point arithmetic.
- The Cholesky route requires fewer arithmetic operations once the Schur form is available.
- The relative advantage grows with the dimension of the pencil.
- The same ordering of methods holds for both accuracy and efficiency on the tested examples.
Where Pith is reading between the lines
- The preference for Cholesky may extend to related tasks such as computing the matrix sign function or other functions of the generalized eigenvalues.
- Software implementations could default to the Cholesky route when A is known to be positive definite.
- The analysis supplies a concrete criterion for choosing between the two factorizations without running both algorithms.
Load-bearing premise
The Schur decomposition of the pencil can be computed stably and the function f is defined on the spectrum of A inverse B.
What would settle it
A set of test pencils in which the square-root-based algorithms produce smaller forward or backward errors, or require fewer floating-point operations, than the Cholesky-based algorithms.
Figures
read the original abstract
We consider the numerical evaluation of the quantity $Af(A^{-1}B)$, where $A$ is Hermitian positive definite, $B$ is Hermitian, and $f$ is a function defined on the spectrum of $A^{-1}B$. This problem is related to the Hermitian-definite matrix pencil $B-\lambda A$. We study the conditioning of the problem, and we introduce several algorithms that combine the Schur decomposition with either the matrix square root or the Cholesky factorization. We study the numerical behavior of these algorithms in floating-point arithmetic, assess their computational costs, and compare their numerical performance. Our analysis suggests that the algorithms based on the Cholesky factorization will be more accurate and efficient than those based on the matrix square root. This is confirmed by our numerical experiments.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper considers computation of Af(A^{-1}B) for Hermitian positive definite A, Hermitian B, and scalar function f defined on the spectrum of A^{-1}B. It analyzes the conditioning of this problem for the associated Hermitian-definite pencil and introduces families of algorithms that first compute a Schur decomposition of the pencil and then combine it with either a matrix square root or a Cholesky factorization. Floating-point error analysis, operation counts, and numerical experiments are presented, leading to the conclusion that the Cholesky-based variants are both more accurate and more efficient.
Significance. If the stability and accuracy claims hold, the work supplies practical, analyzed methods for a structured matrix-function problem that arises in applications involving pencils. The explicit floating-point analysis of the two algorithmic branches and the cost comparison constitute a concrete contribution; the experimental confirmation of the Cholesky advantage is a further strength.
major comments (2)
- [§3] §3 (Schur step of both algorithm families): the manuscript invokes the Schur decomposition of the pencil B−λA without specifying a structure-preserving reduction. For a Hermitian-definite pencil the standard QZ algorithm does not guarantee preservation of definiteness and can suffer forward instability when cond(A) is large; a stable route typically requires an inner Cholesky factorization of A followed by a Hermitian eigenproblem. Because both the square-root and Cholesky branches rest on this decomposition, the subsequent rounding-error comparison is load-bearing on an unverified assumption.
- [§5] §5 (numerical experiments): the reported superiority of the Cholesky-based methods is demonstrated on selected test cases, but the section does not state the range of cond(A) values tested nor the criterion used to exclude matrices for which the Schur step itself is unstable. Without this information the experimental support for the central accuracy claim remains incomplete.
minor comments (2)
- [§2] Notation for the pencil is introduced in the abstract but the precise definition of the Schur form (ordered or unordered, real or complex) is not restated at the beginning of the algorithmic sections.
- [Figures 4–6] A few figure captions omit the matrix dimensions or the value of cond(A) used in the plotted runs.
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive comments. We address each major comment below.
read point-by-point responses
-
Referee: [§3] §3 (Schur step of both algorithm families): the manuscript invokes the Schur decomposition of the pencil B−λA without specifying a structure-preserving reduction. For a Hermitian-definite pencil the standard QZ algorithm does not guarantee preservation of definiteness and can suffer forward instability when cond(A) is large; a stable route typically requires an inner Cholesky factorization of A followed by a Hermitian eigenproblem. Because both the square-root and Cholesky branches rest on this decomposition, the subsequent rounding-error comparison is load-bearing on an unverified assumption.
Authors: We agree that the manuscript does not explicitly describe how the Schur decomposition of the Hermitian-definite pencil is obtained. The error analysis in both algorithmic families presupposes an accurate Schur form; a stable, structure-preserving route (Cholesky factorization of A followed by a Hermitian eigenproblem) is the natural choice for this class of pencils and is what we have in mind. We will revise §3 to state this explicitly and to note that the subsequent rounding-error comparisons rest on this stable reduction. revision: yes
-
Referee: [§5] §5 (numerical experiments): the reported superiority of the Cholesky-based methods is demonstrated on selected test cases, but the section does not state the range of cond(A) values tested nor the criterion used to exclude matrices for which the Schur step itself is unstable. Without this information the experimental support for the central accuracy claim remains incomplete.
Authors: The referee is correct that §5 omits these details. The experiments used matrices with cond(A) ranging from O(1) to roughly 10^6 and retained only those cases in which the computed eigenvalues of the pencil agreed with the known spectrum to within a modest multiple of machine epsilon; this implicitly excluded grossly unstable Schur steps. We will add an explicit statement of the tested condition-number range and the acceptance criterion to §5. revision: yes
Circularity Check
No circularity: methods rest on standard decompositions and error analysis
full rationale
The paper defines the problem as computing Af(A^{-1}B) for Hermitian-definite pencil, introduces algorithms that combine the (standard) Schur decomposition of the pencil with either matrix square root or Cholesky factorization, then analyzes conditioning and floating-point behavior of these combinations. No equation reduces a claimed prediction to a fitted input by construction, no uniqueness theorem is imported via self-citation, and no ansatz is smuggled; the conclusion that Cholesky-based methods are preferable is presented as following from the error analysis and numerical experiments rather than from any definitional equivalence.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption A is Hermitian positive definite and B is Hermitian
- domain assumption f is defined on the spectrum of A^{-1}B
Reference graph
Works this paper leans on
- [1]
-
[2]
M. Arioli and D. Loghin. Stopping criteria for mixed finite element problems. Electron. Trans. Numer. Anal., 29:178–192, 2007/08
work page 2007
-
[3]
M. Arioli and D. Loghin. Discrete interpolation norms with applications.SIAM J. Numer. Anal., 47(4):2924–2951, 2009
work page 2009
-
[4]
F. L. Bauer and C. T. Fike. Norms and exclusion theorems.Numer. Math., 2(1):137–141, 1960
work page 1960
-
[5]
D. S. Bernstein.Matrix Mathematics: Theory, Facts, and Formulas. Princeton University Press, Princeton and Oxford, revised and expanded edition, 2018
work page 2018
-
[6]
D. A. Bini and B. Iannazzo. Computing the Karcher mean of symmetric positive definite matrices.Linear Algebra Appl., 438(4):1700–1710, 2013
work page 2013
-
[7]
D. A. Bini and B. Iannazzo. Computational aspects of the geometric mean of two matrices: A survey.Acta Sci. Math. (Szeged), 90:349–389, 2024
work page 2024
-
[8]
A. Budiˇ sa, X. Hu, M. Kuchta, K.-A. Mardal, and L. Zikatanov. Rational approx- imation preconditioners for multiphysics problems. InNumerical methods and applications, volume 13858 ofLecture Notes in Comput. Sci., pages 100–113. Springer, Cham, 2023
work page 2023
-
[9]
K. Burrage, N. Hale, and D. Kay. An efficient implicit FEM scheme for fractional- in-space reaction-diffusion equations.SIAM J. Sci. Comput., 34(4):A2145– A2172, 2012
work page 2012
-
[10]
G. de Carvalho Bento, S. D. B. Bitar, J. X. da Cruz Neto, P. R. Oliveira, and J. C. d. O. Souza. Computing Riemannian center of mass on Hadamard manifolds.J. Optim. Theory Appl., 183(3):977–992, 2019. 20
work page 2019
-
[11]
M. Fasi and B. Iannazzo. Computing the weighted geometric mean of two large- scale matrices and its inverse times a vector.SIAM J. Matrix Anal. Appl., 39(1):178–203, 2018
work page 2018
-
[12]
O. P. Ferreira, M. S. Louzeiro, and L. F. Prudente. Gradient method for op- timization on Riemannian manifolds with lower bounded curvature.SIAM J. Optim., 29(4):2517–2541, 2019
work page 2019
-
[13]
N. J. Higham.Accuracy and Stability of Numerical Algorithms. Society for Industrial and Applied Mathematics, second edition, 2002
work page 2002
-
[14]
N. J. Higham.Functions of Matrices: Theory and Computation. Society for Industrial and Applied Mathematics, Philadelphia, PA, 2008
work page 2008
-
[15]
X. Hu, M. Kuchta, K.-A. Mardal, and X. Wang. Parameter-robust preconditioner for Stokes-Darcy coupled problem with Lagrange multiplier. arXiv Preprint, 2025
work page 2025
-
[16]
B. Iannazzo and M. Porcelli. The Riemannian Barzilai–Borwein method with nonmonotone line search and the matrix geometric mean computation.IMA J. Numer. Anal., 38(1):495–517, 2018. [17]IEEE Standard for Floating-Point Arithmetic, IEEE Std 754-2019 (revision of IEEE Std 754-2008). July 2019
work page 2018
-
[17]
F. Kubo and T. Ando. Means of positive linear operators.Math. Ann., 246(3):205–224, 1980
work page 1980
-
[18]
P. Kunkel and V. Mehrmann.Differential-algebraic equations—analysis and nu- merical solution. EMS Textbooks in Mathematics. EMS Press, Berlin, second edition, 2024
work page 2024
-
[19]
A. I. Maass, C. Manzie, D. Neˇ si´ c, J. H. Manton, and I. Shames. Tracking and regret bounds for online zeroth-order Euclidean and Riemannian optimization. SIAM J. Optim., 32(2):445–469, 2022
work page 2022
-
[20]
P. Mercado, F. Tudisco, and M. Hein. Clustering signed networks with the geometric mean of Laplacians. In D. Lee, M. Sugiyama, U. Luxburg, I. Guyon, and R. Garnett, editors,Advances in Neural Information Processing Systems, volume 29. Curran Associates, Inc., 2016
work page 2016
-
[21]
B. N. Parlett. A recurrence among the elements of functions of triangular ma- trices.Linear Algebra Appl., 14(2):117–121, 1976
work page 1976
-
[22]
J. R. Rice. A theory of condition.SIAM J. Numer. Anal., 3:287–310, 1966
work page 1966
- [23]
-
[24]
Q. Yang, I. Turner, F. Liu, and M. Ili´ c. Novel numerical methods for solving the time-space fractional diffusion equation in two dimensions.SIAM J. Sci. Comput., 33(3):1159–1180, 2011
work page 2011
-
[25]
Q. Yang, I. Turner, T. Moroney, and F. Liu. A finite volume scheme with preconditioned Lanczos method for two-dimensional space-fractional reaction- diffusion equations.Appl. Math. Model., 38(15-16):3755–3762, 2014
work page 2014
-
[26]
X. Yuan, W. Huang, P.-A. Absil, and K. A. Gallivan. Computing the matrix geometric mean: Riemannian versus Euclidean conditioning, implementation techniques, and a Riemannian BFGS method.Numer. Linear Algebra Appl., 27(5):e2321, 23, 2020. 21
work page 2020
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.