Computing matrix functions associated with a Hermitian--definite pencil

Bruno Iannazzo; Dario A. Bini; Massimiliano Fasi

arxiv: 2510.16473 · v2 · pith:3J5WXYNJnew · submitted 2025-10-18 · 🧮 math.NA · cs.NA

Computing matrix functions associated with a Hermitian--definite pencil

Dario A. Bini , Massimiliano Fasi , Bruno Iannazzo This is my paper

Pith reviewed 2026-05-25 07:25 UTC · model grok-4.3

classification 🧮 math.NA cs.NA

keywords matrix functionsHermitian-definite pencilCholesky factorizationmatrix square rootSchur decompositionnumerical stabilitygeneralized eigenvalue problem

0 comments

The pith

Algorithms that combine Schur decomposition with Cholesky factorization compute Af(A^{-1}B) more accurately and efficiently than those using the matrix square root.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines the numerical computation of Af(A^{-1}B) for Hermitian positive definite A, Hermitian B, and a function f defined on the spectrum of A^{-1}B. This quantity is tied to the Hermitian-definite pencil B minus lambda A. The authors analyze the conditioning of the problem and present algorithms that start from the Schur form of the pencil and then apply either a matrix square root or a Cholesky factorization. They compare floating-point behavior and operation counts, concluding that the Cholesky route is preferable on both accuracy and speed. Experiments on random and structured pencils confirm the predicted advantage.

Core claim

Several algorithms are introduced that reduce the evaluation of Af(A^{-1}B) to the Schur decomposition of the pencil followed by either the matrix square root or the Cholesky factorization; analysis of rounding errors and operation counts shows that the Cholesky-based versions incur smaller perturbations and lower cost, and this ordering is observed in numerical tests.

What carries the argument

Reduction of Af(A^{-1}B) via Schur triangularization of the pencil, followed by either matrix square root or Cholesky factorization to obtain the required function values on the diagonal.

If this is right

Cholesky-based algorithms produce smaller rounding errors than square-root-based ones when evaluated in floating-point arithmetic.
The Cholesky route requires fewer arithmetic operations once the Schur form is available.
The relative advantage grows with the dimension of the pencil.
The same ordering of methods holds for both accuracy and efficiency on the tested examples.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The preference for Cholesky may extend to related tasks such as computing the matrix sign function or other functions of the generalized eigenvalues.
Software implementations could default to the Cholesky route when A is known to be positive definite.
The analysis supplies a concrete criterion for choosing between the two factorizations without running both algorithms.

Load-bearing premise

The Schur decomposition of the pencil can be computed stably and the function f is defined on the spectrum of A inverse B.

What would settle it

A set of test pencils in which the square-root-based algorithms produce smaller forward or backward errors, or require fewer floating-point operations, than the Cholesky-based algorithms.

Figures

Figures reproduced from arXiv: 2510.16473 by Bruno Iannazzo, Dario A. Bini, Massimiliano Fasi.

**Figure 1.** Figure 1: Relative forward error (5.1) of the four implementations on random positive definite n × n matrices of increasing size n, with f(x) = log(x). On the left, A is ill conditioned (µ(A) = 107 ), while B is not (µ(B) = 10); on the right, both matrices are ill conditioned (µ(A) = µ(B) = 107 ). The values reported are the arithmetic mean of 100 measurements with different matrices. • sqrt schur: an implementatio… view at source ↗

**Figure 2.** Figure 2: Relative forward error (5.1) of the four implementations on random positive definite n × n matrices A and B with varying condition numbers for n = 10 and f(x) = log(x), together with the quantity u · cond(φ; A, B). On the left, the condition number of A grows (µ(A) = µ) while B remains well conditioned (µ(B) = 10); on the right, the condition number of both matrices grows (µ(A) = µ(B) = µ). The accuracy is… view at source ↗

**Figure 3.** Figure 3: Timings of the four implementations on n × n random positive definite matrices of increasing size n, with f(x) = log(x). The timings are expressed in seconds. The values reported are the medians of 100 measurements. In fact, the corresponding curves are almost indistinguishable and overlap in all figures. These two algorithms generally outperform significantly both Algorithm 1 (sqrt schur in our implementa… view at source ↗

read the original abstract

We consider the numerical evaluation of the quantity $Af(A^{-1}B)$, where $A$ is Hermitian positive definite, $B$ is Hermitian, and $f$ is a function defined on the spectrum of $A^{-1}B$. This problem is related to the Hermitian-definite matrix pencil $B-\lambda A$. We study the conditioning of the problem, and we introduce several algorithms that combine the Schur decomposition with either the matrix square root or the Cholesky factorization. We study the numerical behavior of these algorithms in floating-point arithmetic, assess their computational costs, and compare their numerical performance. Our analysis suggests that the algorithms based on the Cholesky factorization will be more accurate and efficient than those based on the matrix square root. This is confirmed by our numerical experiments.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Cholesky-based algorithms for Af(A^{-1}B) look more accurate and cheaper than the square-root ones, backed by experiments, but the shared Schur step on the pencil needs checking for stability.

read the letter

The punchline is that the Cholesky-based algorithms for Af(A^{-1}B) come out ahead of the square-root ones in both accuracy and speed according to the analysis and the numerical tests. The paper is new in spelling out these particular combinations of the generalized Schur decomposition with the two factorizations and in comparing their floating-point behavior for this problem. The conditioning study is also a plus. It does a good job laying out the algorithms and backing the claims with experiments. The main soft spot is the shared first step: computing the Schur form of the Hermitian-definite pencil. If they just use a generic QZ routine without the inner Cholesky reduction to a standard eigenproblem, then the forward stability when A is ill-conditioned is not guaranteed, and that could affect how much better the Cholesky branch really is. The abstract does not make clear whether they use the structure-preserving route, so that needs a look in the full text. The experiments support the claim, which is reassuring but does not replace the analysis. This paper is for numerical linear algebra people working on matrix functions. It is a focused piece with reproducible experiments, so it should go to peer review.

Referee Report

2 major / 2 minor

Summary. The paper considers computation of Af(A^{-1}B) for Hermitian positive definite A, Hermitian B, and scalar function f defined on the spectrum of A^{-1}B. It analyzes the conditioning of this problem for the associated Hermitian-definite pencil and introduces families of algorithms that first compute a Schur decomposition of the pencil and then combine it with either a matrix square root or a Cholesky factorization. Floating-point error analysis, operation counts, and numerical experiments are presented, leading to the conclusion that the Cholesky-based variants are both more accurate and more efficient.

Significance. If the stability and accuracy claims hold, the work supplies practical, analyzed methods for a structured matrix-function problem that arises in applications involving pencils. The explicit floating-point analysis of the two algorithmic branches and the cost comparison constitute a concrete contribution; the experimental confirmation of the Cholesky advantage is a further strength.

major comments (2)

[§3] §3 (Schur step of both algorithm families): the manuscript invokes the Schur decomposition of the pencil B−λA without specifying a structure-preserving reduction. For a Hermitian-definite pencil the standard QZ algorithm does not guarantee preservation of definiteness and can suffer forward instability when cond(A) is large; a stable route typically requires an inner Cholesky factorization of A followed by a Hermitian eigenproblem. Because both the square-root and Cholesky branches rest on this decomposition, the subsequent rounding-error comparison is load-bearing on an unverified assumption.
[§5] §5 (numerical experiments): the reported superiority of the Cholesky-based methods is demonstrated on selected test cases, but the section does not state the range of cond(A) values tested nor the criterion used to exclude matrices for which the Schur step itself is unstable. Without this information the experimental support for the central accuracy claim remains incomplete.

minor comments (2)

[§2] Notation for the pencil is introduced in the abstract but the precise definition of the Schur form (ordered or unordered, real or complex) is not restated at the beginning of the algorithmic sections.
[Figures 4–6] A few figure captions omit the matrix dimensions or the value of cond(A) used in the plotted runs.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments. We address each major comment below.

read point-by-point responses

Referee: [§3] §3 (Schur step of both algorithm families): the manuscript invokes the Schur decomposition of the pencil B−λA without specifying a structure-preserving reduction. For a Hermitian-definite pencil the standard QZ algorithm does not guarantee preservation of definiteness and can suffer forward instability when cond(A) is large; a stable route typically requires an inner Cholesky factorization of A followed by a Hermitian eigenproblem. Because both the square-root and Cholesky branches rest on this decomposition, the subsequent rounding-error comparison is load-bearing on an unverified assumption.

Authors: We agree that the manuscript does not explicitly describe how the Schur decomposition of the Hermitian-definite pencil is obtained. The error analysis in both algorithmic families presupposes an accurate Schur form; a stable, structure-preserving route (Cholesky factorization of A followed by a Hermitian eigenproblem) is the natural choice for this class of pencils and is what we have in mind. We will revise §3 to state this explicitly and to note that the subsequent rounding-error comparisons rest on this stable reduction. revision: yes
Referee: [§5] §5 (numerical experiments): the reported superiority of the Cholesky-based methods is demonstrated on selected test cases, but the section does not state the range of cond(A) values tested nor the criterion used to exclude matrices for which the Schur step itself is unstable. Without this information the experimental support for the central accuracy claim remains incomplete.

Authors: The referee is correct that §5 omits these details. The experiments used matrices with cond(A) ranging from O(1) to roughly 10^6 and retained only those cases in which the computed eigenvalues of the pencil agreed with the known spectrum to within a modest multiple of machine epsilon; this implicitly excluded grossly unstable Schur steps. We will add an explicit statement of the tested condition-number range and the acceptance criterion to §5. revision: yes

Circularity Check

0 steps flagged

No circularity: methods rest on standard decompositions and error analysis

full rationale

The paper defines the problem as computing Af(A^{-1}B) for Hermitian-definite pencil, introduces algorithms that combine the (standard) Schur decomposition of the pencil with either matrix square root or Cholesky factorization, then analyzes conditioning and floating-point behavior of these combinations. No equation reduces a claimed prediction to a fitted input by construction, no uniqueness theorem is imported via self-citation, and no ansatz is smuggled; the conclusion that Cholesky-based methods are preferable is presented as following from the error analysis and numerical experiments rather than from any definitional equivalence.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The work rests on standard properties of Hermitian matrices and the existence of Schur and Cholesky factorizations; no free parameters or invented entities are introduced.

axioms (2)

domain assumption A is Hermitian positive definite and B is Hermitian
Problem statement in abstract paragraph 1.
domain assumption f is defined on the spectrum of A^{-1}B
Problem statement in abstract paragraph 1.

pith-pipeline@v0.9.0 · 5665 in / 1129 out tokens · 17967 ms · 2026-05-25T07:25:25.149141+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

26 extracted references · 26 canonical work pages

[1]

Arioli, D

M. Arioli, D. Kourounis, and D. Loghin. Discrete fractional Sobolev norms for domain decomposition preconditioning.IMA J. Numer. Anal., 33(1):318–342, 2013

work page 2013
[2]

Arioli and D

M. Arioli and D. Loghin. Stopping criteria for mixed finite element problems. Electron. Trans. Numer. Anal., 29:178–192, 2007/08

work page 2007
[3]

Arioli and D

M. Arioli and D. Loghin. Discrete interpolation norms with applications.SIAM J. Numer. Anal., 47(4):2924–2951, 2009

work page 2009
[4]

F. L. Bauer and C. T. Fike. Norms and exclusion theorems.Numer. Math., 2(1):137–141, 1960

work page 1960
[5]

D. S. Bernstein.Matrix Mathematics: Theory, Facts, and Formulas. Princeton University Press, Princeton and Oxford, revised and expanded edition, 2018

work page 2018
[6]

D. A. Bini and B. Iannazzo. Computing the Karcher mean of symmetric positive definite matrices.Linear Algebra Appl., 438(4):1700–1710, 2013

work page 2013
[7]

D. A. Bini and B. Iannazzo. Computational aspects of the geometric mean of two matrices: A survey.Acta Sci. Math. (Szeged), 90:349–389, 2024

work page 2024
[8]

Budiˇ sa, X

A. Budiˇ sa, X. Hu, M. Kuchta, K.-A. Mardal, and L. Zikatanov. Rational approx- imation preconditioners for multiphysics problems. InNumerical methods and applications, volume 13858 ofLecture Notes in Comput. Sci., pages 100–113. Springer, Cham, 2023

work page 2023
[9]

Burrage, N

K. Burrage, N. Hale, and D. Kay. An efficient implicit FEM scheme for fractional- in-space reaction-diffusion equations.SIAM J. Sci. Comput., 34(4):A2145– A2172, 2012

work page 2012
[10]

de Carvalho Bento, S

G. de Carvalho Bento, S. D. B. Bitar, J. X. da Cruz Neto, P. R. Oliveira, and J. C. d. O. Souza. Computing Riemannian center of mass on Hadamard manifolds.J. Optim. Theory Appl., 183(3):977–992, 2019. 20

work page 2019
[11]

Fasi and B

M. Fasi and B. Iannazzo. Computing the weighted geometric mean of two large- scale matrices and its inverse times a vector.SIAM J. Matrix Anal. Appl., 39(1):178–203, 2018

work page 2018
[12]

O. P. Ferreira, M. S. Louzeiro, and L. F. Prudente. Gradient method for op- timization on Riemannian manifolds with lower bounded curvature.SIAM J. Optim., 29(4):2517–2541, 2019

work page 2019
[13]

N. J. Higham.Accuracy and Stability of Numerical Algorithms. Society for Industrial and Applied Mathematics, second edition, 2002

work page 2002
[14]

N. J. Higham.Functions of Matrices: Theory and Computation. Society for Industrial and Applied Mathematics, Philadelphia, PA, 2008

work page 2008
[15]

X. Hu, M. Kuchta, K.-A. Mardal, and X. Wang. Parameter-robust preconditioner for Stokes-Darcy coupled problem with Lagrange multiplier. arXiv Preprint, 2025

work page 2025
[16]

Iannazzo and M

B. Iannazzo and M. Porcelli. The Riemannian Barzilai–Borwein method with nonmonotone line search and the matrix geometric mean computation.IMA J. Numer. Anal., 38(1):495–517, 2018. [17]IEEE Standard for Floating-Point Arithmetic, IEEE Std 754-2019 (revision of IEEE Std 754-2008). July 2019

work page 2018
[17]

Kubo and T

F. Kubo and T. Ando. Means of positive linear operators.Math. Ann., 246(3):205–224, 1980

work page 1980
[18]

Kunkel and V

P. Kunkel and V. Mehrmann.Differential-algebraic equations—analysis and nu- merical solution. EMS Textbooks in Mathematics. EMS Press, Berlin, second edition, 2024

work page 2024
[19]

A. I. Maass, C. Manzie, D. Neˇ si´ c, J. H. Manton, and I. Shames. Tracking and regret bounds for online zeroth-order Euclidean and Riemannian optimization. SIAM J. Optim., 32(2):445–469, 2022

work page 2022
[20]

Mercado, F

P. Mercado, F. Tudisco, and M. Hein. Clustering signed networks with the geometric mean of Laplacians. In D. Lee, M. Sugiyama, U. Luxburg, I. Guyon, and R. Garnett, editors,Advances in Neural Information Processing Systems, volume 29. Curran Associates, Inc., 2016

work page 2016
[21]

B. N. Parlett. A recurrence among the elements of functions of triangular ma- trices.Linear Algebra Appl., 14(2):117–121, 1976

work page 1976
[22]

J. R. Rice. A theory of condition.SIAM J. Numer. Anal., 3:287–310, 1966

work page 1966
[23]

Tupker, S

Q. Tupker, S. Said, and C. Mostajeran. Online learning of Riemannian hidden Markov models in homogeneous Hadamard spaces. 12829:37–44, 2021

work page 2021
[24]

Q. Yang, I. Turner, F. Liu, and M. Ili´ c. Novel numerical methods for solving the time-space fractional diffusion equation in two dimensions.SIAM J. Sci. Comput., 33(3):1159–1180, 2011

work page 2011
[25]

Q. Yang, I. Turner, T. Moroney, and F. Liu. A finite volume scheme with preconditioned Lanczos method for two-dimensional space-fractional reaction- diffusion equations.Appl. Math. Model., 38(15-16):3755–3762, 2014

work page 2014
[26]

X. Yuan, W. Huang, P.-A. Absil, and K. A. Gallivan. Computing the matrix geometric mean: Riemannian versus Euclidean conditioning, implementation techniques, and a Riemannian BFGS method.Numer. Linear Algebra Appl., 27(5):e2321, 23, 2020. 21

work page 2020

[1] [1]

Arioli, D

M. Arioli, D. Kourounis, and D. Loghin. Discrete fractional Sobolev norms for domain decomposition preconditioning.IMA J. Numer. Anal., 33(1):318–342, 2013

work page 2013

[2] [2]

Arioli and D

M. Arioli and D. Loghin. Stopping criteria for mixed finite element problems. Electron. Trans. Numer. Anal., 29:178–192, 2007/08

work page 2007

[3] [3]

Arioli and D

M. Arioli and D. Loghin. Discrete interpolation norms with applications.SIAM J. Numer. Anal., 47(4):2924–2951, 2009

work page 2009

[4] [4]

F. L. Bauer and C. T. Fike. Norms and exclusion theorems.Numer. Math., 2(1):137–141, 1960

work page 1960

[5] [5]

D. S. Bernstein.Matrix Mathematics: Theory, Facts, and Formulas. Princeton University Press, Princeton and Oxford, revised and expanded edition, 2018

work page 2018

[6] [6]

D. A. Bini and B. Iannazzo. Computing the Karcher mean of symmetric positive definite matrices.Linear Algebra Appl., 438(4):1700–1710, 2013

work page 2013

[7] [7]

D. A. Bini and B. Iannazzo. Computational aspects of the geometric mean of two matrices: A survey.Acta Sci. Math. (Szeged), 90:349–389, 2024

work page 2024

[8] [8]

Budiˇ sa, X

A. Budiˇ sa, X. Hu, M. Kuchta, K.-A. Mardal, and L. Zikatanov. Rational approx- imation preconditioners for multiphysics problems. InNumerical methods and applications, volume 13858 ofLecture Notes in Comput. Sci., pages 100–113. Springer, Cham, 2023

work page 2023

[9] [9]

Burrage, N

K. Burrage, N. Hale, and D. Kay. An efficient implicit FEM scheme for fractional- in-space reaction-diffusion equations.SIAM J. Sci. Comput., 34(4):A2145– A2172, 2012

work page 2012

[10] [10]

de Carvalho Bento, S

G. de Carvalho Bento, S. D. B. Bitar, J. X. da Cruz Neto, P. R. Oliveira, and J. C. d. O. Souza. Computing Riemannian center of mass on Hadamard manifolds.J. Optim. Theory Appl., 183(3):977–992, 2019. 20

work page 2019

[11] [11]

Fasi and B

M. Fasi and B. Iannazzo. Computing the weighted geometric mean of two large- scale matrices and its inverse times a vector.SIAM J. Matrix Anal. Appl., 39(1):178–203, 2018

work page 2018

[12] [12]

O. P. Ferreira, M. S. Louzeiro, and L. F. Prudente. Gradient method for op- timization on Riemannian manifolds with lower bounded curvature.SIAM J. Optim., 29(4):2517–2541, 2019

work page 2019

[13] [13]

N. J. Higham.Accuracy and Stability of Numerical Algorithms. Society for Industrial and Applied Mathematics, second edition, 2002

work page 2002

[14] [14]

N. J. Higham.Functions of Matrices: Theory and Computation. Society for Industrial and Applied Mathematics, Philadelphia, PA, 2008

work page 2008

[15] [15]

X. Hu, M. Kuchta, K.-A. Mardal, and X. Wang. Parameter-robust preconditioner for Stokes-Darcy coupled problem with Lagrange multiplier. arXiv Preprint, 2025

work page 2025

[16] [16]

Iannazzo and M

B. Iannazzo and M. Porcelli. The Riemannian Barzilai–Borwein method with nonmonotone line search and the matrix geometric mean computation.IMA J. Numer. Anal., 38(1):495–517, 2018. [17]IEEE Standard for Floating-Point Arithmetic, IEEE Std 754-2019 (revision of IEEE Std 754-2008). July 2019

work page 2018

[17] [17]

Kubo and T

F. Kubo and T. Ando. Means of positive linear operators.Math. Ann., 246(3):205–224, 1980

work page 1980

[18] [18]

Kunkel and V

P. Kunkel and V. Mehrmann.Differential-algebraic equations—analysis and nu- merical solution. EMS Textbooks in Mathematics. EMS Press, Berlin, second edition, 2024

work page 2024

[19] [19]

A. I. Maass, C. Manzie, D. Neˇ si´ c, J. H. Manton, and I. Shames. Tracking and regret bounds for online zeroth-order Euclidean and Riemannian optimization. SIAM J. Optim., 32(2):445–469, 2022

work page 2022

[20] [20]

Mercado, F

P. Mercado, F. Tudisco, and M. Hein. Clustering signed networks with the geometric mean of Laplacians. In D. Lee, M. Sugiyama, U. Luxburg, I. Guyon, and R. Garnett, editors,Advances in Neural Information Processing Systems, volume 29. Curran Associates, Inc., 2016

work page 2016

[21] [21]

B. N. Parlett. A recurrence among the elements of functions of triangular ma- trices.Linear Algebra Appl., 14(2):117–121, 1976

work page 1976

[22] [22]

J. R. Rice. A theory of condition.SIAM J. Numer. Anal., 3:287–310, 1966

work page 1966

[23] [23]

Tupker, S

Q. Tupker, S. Said, and C. Mostajeran. Online learning of Riemannian hidden Markov models in homogeneous Hadamard spaces. 12829:37–44, 2021

work page 2021

[24] [24]

Q. Yang, I. Turner, F. Liu, and M. Ili´ c. Novel numerical methods for solving the time-space fractional diffusion equation in two dimensions.SIAM J. Sci. Comput., 33(3):1159–1180, 2011

work page 2011

[25] [25]

Q. Yang, I. Turner, T. Moroney, and F. Liu. A finite volume scheme with preconditioned Lanczos method for two-dimensional space-fractional reaction- diffusion equations.Appl. Math. Model., 38(15-16):3755–3762, 2014

work page 2014

[26] [26]

X. Yuan, W. Huang, P.-A. Absil, and K. A. Gallivan. Computing the matrix geometric mean: Riemannian versus Euclidean conditioning, implementation techniques, and a Riemannian BFGS method.Numer. Linear Algebra Appl., 27(5):e2321, 23, 2020. 21

work page 2020