pith. sign in

arxiv: 2510.16473 · v2 · pith:3J5WXYNJnew · submitted 2025-10-18 · 🧮 math.NA · cs.NA

Computing matrix functions associated with a Hermitian--definite pencil

Pith reviewed 2026-05-25 07:25 UTC · model grok-4.3

classification 🧮 math.NA cs.NA
keywords matrix functionsHermitian-definite pencilCholesky factorizationmatrix square rootSchur decompositionnumerical stabilitygeneralized eigenvalue problem
0
0 comments X

The pith

Algorithms that combine Schur decomposition with Cholesky factorization compute Af(A^{-1}B) more accurately and efficiently than those using the matrix square root.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines the numerical computation of Af(A^{-1}B) for Hermitian positive definite A, Hermitian B, and a function f defined on the spectrum of A^{-1}B. This quantity is tied to the Hermitian-definite pencil B minus lambda A. The authors analyze the conditioning of the problem and present algorithms that start from the Schur form of the pencil and then apply either a matrix square root or a Cholesky factorization. They compare floating-point behavior and operation counts, concluding that the Cholesky route is preferable on both accuracy and speed. Experiments on random and structured pencils confirm the predicted advantage.

Core claim

Several algorithms are introduced that reduce the evaluation of Af(A^{-1}B) to the Schur decomposition of the pencil followed by either the matrix square root or the Cholesky factorization; analysis of rounding errors and operation counts shows that the Cholesky-based versions incur smaller perturbations and lower cost, and this ordering is observed in numerical tests.

What carries the argument

Reduction of Af(A^{-1}B) via Schur triangularization of the pencil, followed by either matrix square root or Cholesky factorization to obtain the required function values on the diagonal.

If this is right

  • Cholesky-based algorithms produce smaller rounding errors than square-root-based ones when evaluated in floating-point arithmetic.
  • The Cholesky route requires fewer arithmetic operations once the Schur form is available.
  • The relative advantage grows with the dimension of the pencil.
  • The same ordering of methods holds for both accuracy and efficiency on the tested examples.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The preference for Cholesky may extend to related tasks such as computing the matrix sign function or other functions of the generalized eigenvalues.
  • Software implementations could default to the Cholesky route when A is known to be positive definite.
  • The analysis supplies a concrete criterion for choosing between the two factorizations without running both algorithms.

Load-bearing premise

The Schur decomposition of the pencil can be computed stably and the function f is defined on the spectrum of A inverse B.

What would settle it

A set of test pencils in which the square-root-based algorithms produce smaller forward or backward errors, or require fewer floating-point operations, than the Cholesky-based algorithms.

Figures

Figures reproduced from arXiv: 2510.16473 by Bruno Iannazzo, Dario A. Bini, Massimiliano Fasi.

Figure 1
Figure 1. Figure 1: Relative forward error (5.1) of the four implementations on random positive definite n × n matrices of increasing size n, with f(x) = log(x). On the left, A is ill condi￾tioned (µ(A) = 107 ), while B is not (µ(B) = 10); on the right, both matrices are ill conditioned (µ(A) = µ(B) = 107 ). The values reported are the arithmetic mean of 100 measurements with different matrices. • sqrt schur: an implementatio… view at source ↗
Figure 2
Figure 2. Figure 2: Relative forward error (5.1) of the four implementations on random positive definite n × n matrices A and B with varying condition numbers for n = 10 and f(x) = log(x), together with the quantity u · cond(φ; A, B). On the left, the condition number of A grows (µ(A) = µ) while B remains well conditioned (µ(B) = 10); on the right, the condition number of both matrices grows (µ(A) = µ(B) = µ). The accuracy is… view at source ↗
Figure 3
Figure 3. Figure 3: Timings of the four implementations on n × n random positive definite matrices of increasing size n, with f(x) = log(x). The timings are expressed in seconds. The values reported are the medians of 100 measurements. In fact, the corresponding curves are almost indistinguishable and overlap in all figures. These two algorithms generally outperform significantly both Algorithm 1 (sqrt schur in our implementa… view at source ↗
read the original abstract

We consider the numerical evaluation of the quantity $Af(A^{-1}B)$, where $A$ is Hermitian positive definite, $B$ is Hermitian, and $f$ is a function defined on the spectrum of $A^{-1}B$. This problem is related to the Hermitian-definite matrix pencil $B-\lambda A$. We study the conditioning of the problem, and we introduce several algorithms that combine the Schur decomposition with either the matrix square root or the Cholesky factorization. We study the numerical behavior of these algorithms in floating-point arithmetic, assess their computational costs, and compare their numerical performance. Our analysis suggests that the algorithms based on the Cholesky factorization will be more accurate and efficient than those based on the matrix square root. This is confirmed by our numerical experiments.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper considers computation of Af(A^{-1}B) for Hermitian positive definite A, Hermitian B, and scalar function f defined on the spectrum of A^{-1}B. It analyzes the conditioning of this problem for the associated Hermitian-definite pencil and introduces families of algorithms that first compute a Schur decomposition of the pencil and then combine it with either a matrix square root or a Cholesky factorization. Floating-point error analysis, operation counts, and numerical experiments are presented, leading to the conclusion that the Cholesky-based variants are both more accurate and more efficient.

Significance. If the stability and accuracy claims hold, the work supplies practical, analyzed methods for a structured matrix-function problem that arises in applications involving pencils. The explicit floating-point analysis of the two algorithmic branches and the cost comparison constitute a concrete contribution; the experimental confirmation of the Cholesky advantage is a further strength.

major comments (2)
  1. [§3] §3 (Schur step of both algorithm families): the manuscript invokes the Schur decomposition of the pencil B−λA without specifying a structure-preserving reduction. For a Hermitian-definite pencil the standard QZ algorithm does not guarantee preservation of definiteness and can suffer forward instability when cond(A) is large; a stable route typically requires an inner Cholesky factorization of A followed by a Hermitian eigenproblem. Because both the square-root and Cholesky branches rest on this decomposition, the subsequent rounding-error comparison is load-bearing on an unverified assumption.
  2. [§5] §5 (numerical experiments): the reported superiority of the Cholesky-based methods is demonstrated on selected test cases, but the section does not state the range of cond(A) values tested nor the criterion used to exclude matrices for which the Schur step itself is unstable. Without this information the experimental support for the central accuracy claim remains incomplete.
minor comments (2)
  1. [§2] Notation for the pencil is introduced in the abstract but the precise definition of the Schur form (ordered or unordered, real or complex) is not restated at the beginning of the algorithmic sections.
  2. [Figures 4–6] A few figure captions omit the matrix dimensions or the value of cond(A) used in the plotted runs.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments. We address each major comment below.

read point-by-point responses
  1. Referee: [§3] §3 (Schur step of both algorithm families): the manuscript invokes the Schur decomposition of the pencil B−λA without specifying a structure-preserving reduction. For a Hermitian-definite pencil the standard QZ algorithm does not guarantee preservation of definiteness and can suffer forward instability when cond(A) is large; a stable route typically requires an inner Cholesky factorization of A followed by a Hermitian eigenproblem. Because both the square-root and Cholesky branches rest on this decomposition, the subsequent rounding-error comparison is load-bearing on an unverified assumption.

    Authors: We agree that the manuscript does not explicitly describe how the Schur decomposition of the Hermitian-definite pencil is obtained. The error analysis in both algorithmic families presupposes an accurate Schur form; a stable, structure-preserving route (Cholesky factorization of A followed by a Hermitian eigenproblem) is the natural choice for this class of pencils and is what we have in mind. We will revise §3 to state this explicitly and to note that the subsequent rounding-error comparisons rest on this stable reduction. revision: yes

  2. Referee: [§5] §5 (numerical experiments): the reported superiority of the Cholesky-based methods is demonstrated on selected test cases, but the section does not state the range of cond(A) values tested nor the criterion used to exclude matrices for which the Schur step itself is unstable. Without this information the experimental support for the central accuracy claim remains incomplete.

    Authors: The referee is correct that §5 omits these details. The experiments used matrices with cond(A) ranging from O(1) to roughly 10^6 and retained only those cases in which the computed eigenvalues of the pencil agreed with the known spectrum to within a modest multiple of machine epsilon; this implicitly excluded grossly unstable Schur steps. We will add an explicit statement of the tested condition-number range and the acceptance criterion to §5. revision: yes

Circularity Check

0 steps flagged

No circularity: methods rest on standard decompositions and error analysis

full rationale

The paper defines the problem as computing Af(A^{-1}B) for Hermitian-definite pencil, introduces algorithms that combine the (standard) Schur decomposition of the pencil with either matrix square root or Cholesky factorization, then analyzes conditioning and floating-point behavior of these combinations. No equation reduces a claimed prediction to a fitted input by construction, no uniqueness theorem is imported via self-citation, and no ansatz is smuggled; the conclusion that Cholesky-based methods are preferable is presented as following from the error analysis and numerical experiments rather than from any definitional equivalence.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The work rests on standard properties of Hermitian matrices and the existence of Schur and Cholesky factorizations; no free parameters or invented entities are introduced.

axioms (2)
  • domain assumption A is Hermitian positive definite and B is Hermitian
    Problem statement in abstract paragraph 1.
  • domain assumption f is defined on the spectrum of A^{-1}B
    Problem statement in abstract paragraph 1.

pith-pipeline@v0.9.0 · 5665 in / 1129 out tokens · 17967 ms · 2026-05-25T07:25:25.149141+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

26 extracted references · 26 canonical work pages

  1. [1]

    Arioli, D

    M. Arioli, D. Kourounis, and D. Loghin. Discrete fractional Sobolev norms for domain decomposition preconditioning.IMA J. Numer. Anal., 33(1):318–342, 2013

  2. [2]

    Arioli and D

    M. Arioli and D. Loghin. Stopping criteria for mixed finite element problems. Electron. Trans. Numer. Anal., 29:178–192, 2007/08

  3. [3]

    Arioli and D

    M. Arioli and D. Loghin. Discrete interpolation norms with applications.SIAM J. Numer. Anal., 47(4):2924–2951, 2009

  4. [4]

    F. L. Bauer and C. T. Fike. Norms and exclusion theorems.Numer. Math., 2(1):137–141, 1960

  5. [5]

    D. S. Bernstein.Matrix Mathematics: Theory, Facts, and Formulas. Princeton University Press, Princeton and Oxford, revised and expanded edition, 2018

  6. [6]

    D. A. Bini and B. Iannazzo. Computing the Karcher mean of symmetric positive definite matrices.Linear Algebra Appl., 438(4):1700–1710, 2013

  7. [7]

    D. A. Bini and B. Iannazzo. Computational aspects of the geometric mean of two matrices: A survey.Acta Sci. Math. (Szeged), 90:349–389, 2024

  8. [8]

    Budiˇ sa, X

    A. Budiˇ sa, X. Hu, M. Kuchta, K.-A. Mardal, and L. Zikatanov. Rational approx- imation preconditioners for multiphysics problems. InNumerical methods and applications, volume 13858 ofLecture Notes in Comput. Sci., pages 100–113. Springer, Cham, 2023

  9. [9]

    Burrage, N

    K. Burrage, N. Hale, and D. Kay. An efficient implicit FEM scheme for fractional- in-space reaction-diffusion equations.SIAM J. Sci. Comput., 34(4):A2145– A2172, 2012

  10. [10]

    de Carvalho Bento, S

    G. de Carvalho Bento, S. D. B. Bitar, J. X. da Cruz Neto, P. R. Oliveira, and J. C. d. O. Souza. Computing Riemannian center of mass on Hadamard manifolds.J. Optim. Theory Appl., 183(3):977–992, 2019. 20

  11. [11]

    Fasi and B

    M. Fasi and B. Iannazzo. Computing the weighted geometric mean of two large- scale matrices and its inverse times a vector.SIAM J. Matrix Anal. Appl., 39(1):178–203, 2018

  12. [12]

    O. P. Ferreira, M. S. Louzeiro, and L. F. Prudente. Gradient method for op- timization on Riemannian manifolds with lower bounded curvature.SIAM J. Optim., 29(4):2517–2541, 2019

  13. [13]

    N. J. Higham.Accuracy and Stability of Numerical Algorithms. Society for Industrial and Applied Mathematics, second edition, 2002

  14. [14]

    N. J. Higham.Functions of Matrices: Theory and Computation. Society for Industrial and Applied Mathematics, Philadelphia, PA, 2008

  15. [15]

    X. Hu, M. Kuchta, K.-A. Mardal, and X. Wang. Parameter-robust preconditioner for Stokes-Darcy coupled problem with Lagrange multiplier. arXiv Preprint, 2025

  16. [16]

    Iannazzo and M

    B. Iannazzo and M. Porcelli. The Riemannian Barzilai–Borwein method with nonmonotone line search and the matrix geometric mean computation.IMA J. Numer. Anal., 38(1):495–517, 2018. [17]IEEE Standard for Floating-Point Arithmetic, IEEE Std 754-2019 (revision of IEEE Std 754-2008). July 2019

  17. [17]

    Kubo and T

    F. Kubo and T. Ando. Means of positive linear operators.Math. Ann., 246(3):205–224, 1980

  18. [18]

    Kunkel and V

    P. Kunkel and V. Mehrmann.Differential-algebraic equations—analysis and nu- merical solution. EMS Textbooks in Mathematics. EMS Press, Berlin, second edition, 2024

  19. [19]

    A. I. Maass, C. Manzie, D. Neˇ si´ c, J. H. Manton, and I. Shames. Tracking and regret bounds for online zeroth-order Euclidean and Riemannian optimization. SIAM J. Optim., 32(2):445–469, 2022

  20. [20]

    Mercado, F

    P. Mercado, F. Tudisco, and M. Hein. Clustering signed networks with the geometric mean of Laplacians. In D. Lee, M. Sugiyama, U. Luxburg, I. Guyon, and R. Garnett, editors,Advances in Neural Information Processing Systems, volume 29. Curran Associates, Inc., 2016

  21. [21]

    B. N. Parlett. A recurrence among the elements of functions of triangular ma- trices.Linear Algebra Appl., 14(2):117–121, 1976

  22. [22]

    J. R. Rice. A theory of condition.SIAM J. Numer. Anal., 3:287–310, 1966

  23. [23]

    Tupker, S

    Q. Tupker, S. Said, and C. Mostajeran. Online learning of Riemannian hidden Markov models in homogeneous Hadamard spaces. 12829:37–44, 2021

  24. [24]

    Q. Yang, I. Turner, F. Liu, and M. Ili´ c. Novel numerical methods for solving the time-space fractional diffusion equation in two dimensions.SIAM J. Sci. Comput., 33(3):1159–1180, 2011

  25. [25]

    Q. Yang, I. Turner, T. Moroney, and F. Liu. A finite volume scheme with preconditioned Lanczos method for two-dimensional space-fractional reaction- diffusion equations.Appl. Math. Model., 38(15-16):3755–3762, 2014

  26. [26]

    X. Yuan, W. Huang, P.-A. Absil, and K. A. Gallivan. Computing the matrix geometric mean: Riemannian versus Euclidean conditioning, implementation techniques, and a Riemannian BFGS method.Numer. Linear Algebra Appl., 27(5):e2321, 23, 2020. 21