A Theory of the NEPv Approach for Optimization On the Stiefel Manifold

Ren-Cang Li

arxiv: 2305.00091 · v6 · submitted 2023-04-28 · 🧮 math.OC

A Theory of the NEPv Approach for Optimization On the Stiefel Manifold

Ren-Cang Li This is my paper

Pith reviewed 2026-05-24 08:44 UTC · model grok-4.3

classification 🧮 math.OC

keywords Stiefel manifoldNEPvSCF iterationoptimizationconvergenceatomic functionsmanifold optimization

0 comments

The pith

A unifying framework guarantees global convergence to a stationary point for NEPv optimization on the Stiefel manifold when the objective meets basic assumptions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a general theory to justify the NEPv approach, which converts the first-order optimality condition into a nonlinear eigenvalue problem solved by SCF iteration. Under the basic assumptions the iteration is shown to increase the objective monotonically at every step while converging globally to a stationary point. Atomic functions, which include common matrix traces of linear and quadratic forms, and convex compositions of atomic functions are proven to satisfy the assumptions, supplying a sizable collection of objectives for which the method is reliable.

Core claim

If the objective satisfies the basic assumptions, the SCF iteration for the associated NEPv or NPDo produces a monotonically increasing sequence of objective values and converges globally to a stationary point. Atomic functions that cover traces of linear and quadratic forms satisfy the assumptions, as do convex compositions of atomic functions, thereby justifying the NEPv/NPDo approach for this large class of objectives.

What carries the argument

The unifying framework built on basic assumptions, with the notion of atomic functions as the central examples that meet those assumptions.

Load-bearing premise

The objective function must satisfy certain basic assumptions whose precise requirements and verification procedure remain unspecified.

What would settle it

An explicit objective function that meets the basic assumptions yet produces an SCF sequence that fails to increase the objective monotonically or fails to reach a stationary point.

read the original abstract

The NEPv approach has been increasingly used lately for optimization on the Stiefel manifold arising from machine learning. General speaking, the approach first turns the first order optimality condition, also known as the KKT condition, into a nonlinear eigenvalue problem with eigenvector dependency (NEPv) or a nonlinear polar decomposition with orthogonal factor dependency (NPDo) and then solve the nonlinear problem via some variations of the self-consistent-field (SCF) iteration. The difficulty, however, lies in designing a proper SCF iteration so that a maximizer is found at the end. Currently, each use of the approach is very much individualized, especially in its convergence analysis to show that the approach does work or otherwise. In this paper, a unifying framework is established. The framework is built upon some basic assumptions. If the basic assumptions are satisfied, globally convergence is guaranteed to a stationary point and during the SCF iterative process that leads to the stationary point, the objective function increases monotonically. Also a notion of atomic functions is proposed, which include commonly used matrix traces of linear and quadratic forms as special ones. It is shown that the basic assumptions are satisfied by atomic functions and by convex compositions of atomic functions. Together they provide a large collection of objectives for which the NEPv/NPDo approach is guaranteed to work.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper sets up a unifying convergence theory for NEPv/SCF iterations on Stiefel under assumptions that hold for atomic functions and convex compositions of them.

read the letter

The main thing this paper does is replace individualized convergence arguments for the NEPv approach with one set of basic assumptions. Under those assumptions the SCF iteration increases the objective at every step and converges globally to a stationary point. The authors introduce atomic functions that cover the usual trace-linear and trace-quadratic objectives, then show that convex compositions of atomic functions also meet the assumptions. This gives a single template that covers a sizable collection of objectives arising in machine learning manifold problems. That organization is the concrete advance. The argument is self-contained for the class they define, with no visible circularity or load-bearing gaps once the assumptions are granted. The stress-test note is consistent with what is described: the logic verifies the assumptions for atomic functions rather than assuming them without check. The main open question is how restrictive or easy to verify the basic assumptions turn out to be for objectives outside the atomic and convex-composition cases. If verification is routine, the result extends readily; if it is not, the practical scope stays narrower than the abstract suggests. Readers working on numerical methods for orthogonal constraints or manifold optimization will find the framework useful as a reference point. It is worth sending to referees because it addresses a real gap in the existing literature even if the assumptions will need careful examination in any revision.

Referee Report

0 major / 2 minor

Summary. The manuscript develops a unifying framework for the NEPv/NPDo approach to optimization on the Stiefel manifold. Under a set of basic assumptions on the objective, the SCF iteration is proven to produce a monotonic increase in the objective value and to converge globally to a stationary point. The paper introduces the class of atomic functions (encompassing trace-linear and trace-quadratic forms as special cases) and shows that the basic assumptions hold for atomic functions as well as for convex compositions of atomic functions, thereby covering a broad collection of objectives arising in machine learning.

Significance. If the central claims hold, the work supplies a general convergence theory that replaces case-by-case analyses for NEPv methods on the Stiefel manifold. The explicit verification that the assumptions are satisfied by atomic functions and their convex compositions is a concrete strength, as it identifies a sizable, readily applicable class of objectives for which the SCF iteration is guaranteed to behave as claimed.

minor comments (2)

[Abstract] The abstract states that the framework is 'built upon some basic assumptions' but does not list them; a brief enumeration of the assumptions (with forward references to their precise statements in §2 or §3) would improve readability for readers who wish to check applicability without reading the full proofs first.
[§1] Notation for the Stiefel manifold and the orthogonal factor in the NPDo formulation should be introduced once with a single consistent symbol (e.g., X or Q) rather than switching between several symbols across sections.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive assessment of the manuscript, the accurate summary of its contributions, and the recommendation of minor revision. The report correctly identifies the unifying framework based on basic assumptions, the guaranteed monotonic increase and global convergence to stationary points, and the concrete class of atomic functions together with their convex compositions as the main strengths.

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper defines a set of basic assumptions on the objective function, proves that the SCF iteration yields monotonic increase and global convergence to a stationary point whenever the assumptions hold, and separately verifies that the assumptions are satisfied by the explicitly defined class of atomic functions (including trace-linear and trace-quadratic cases) together with their convex compositions. This structure is self-contained: the convergence theorem is conditional on the assumptions, and the verification step is an independent check rather than a reduction of the result to its own inputs, fitted parameters, or self-citations. No load-bearing step reduces by construction to a prior result from the same authors or to a renaming of known patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The central claim rests on the basic assumptions (unspecified in abstract) and the newly introduced atomic functions as the class for which the assumptions hold; no free parameters or external benchmarks are mentioned.

axioms (1)

domain assumption Basic assumptions on the objective function and SCF iteration for global convergence and monotonicity
The convergence guarantee is conditional on these assumptions being satisfied, which are the load-bearing premises of the framework.

invented entities (1)

Atomic functions no independent evidence
purpose: To define a broad class of objective functions (including traces of linear and quadratic forms) that satisfy the basic assumptions
Newly proposed concept in the paper to cover commonly used objectives in the NEPv approach.

pith-pipeline@v0.9.0 · 5756 in / 1315 out tokens · 52319 ms · 2026-05-24T08:44:40.971950+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

81 extracted references · 81 canonical work pages

[1]

Absil, R

P.-A. Absil, R. Mahony, and R. Sepulchre. Optimization Algorithms On Matrix Manifolds . Princeton University Press, Princeton, NJ, 2008

work page 2008
[2]

Anderson, Z

E. Anderson, Z. Bai, C. Bischof, J. Demmel, J. Dongarra, J. Du C roz, A. Greenbaum, S. Ham- marling, A. McKenney, S. Ostrouchov, and D. Sorensen. LAPACK Users’ Guide . SIAM, Philadelphia, 3rd edition, 1999

work page 1999
[3]

Z. Bai, J. Demmel, J. Dongarra, A. Ruhe, and H. van der Vorst (e ditors). Templates for the solution of Algebraic Eigenvalue Problems: A Practical Gui de. SIAM, Philadelphia, 2000

work page 2000
[4]

Bai, R.-C

Z. Bai, R.-C. Li, and D. Lu. Sharp estimation of convergence rate for self-consistent ﬁeld iteration to solve eigenvector-dependent nonlinear eigenvalue pro blems. SIAM J. Matrix Anal. Appl., 43(1):301–327, 2022

work page 2022
[5]

Bai and D

Z. Bai and D. Lu. Variational characterization of monotone non linear eigenvector problems and geometry of self-consistent-ﬁeld iteration. SIAM J. Matrix Anal. Appl. , 46(1):84–111, 2024

work page 2024
[6]

Balogh, T

J. Balogh, T. Csendes, and T. Rapcs’a. Some global optimization p roblems on Stiefel mani- folds. J. Global Optim. , 30:91–101, 2004

work page 2004
[7]

Benner and X

P. Benner and X. Liang. Convergence analysis of vector extend ed locally optimal block pre- conditioned extended conjugate gradient method for computing e xtreme eigenvalues. Numer. Linear Algebra Appl. , 29(6):e2445, 2022. 24 pages

work page 2022
[8]

R. Bhatia. Matrix Analysis . Graduate Texts in Mathematics, vol. 169. Springer, New York, 1996

work page 1996
[9]

Birtea, I

P. Birtea, I. Ca¸ su, and D. Com˘ anescu. First order optimality c onditions and steepest descent algorithm on orthogonal Stiefel manifolds. Opt. Lett. , 13:1773–1791, 2019

work page 2019
[10]

Bolla, G

M. Bolla, G. Michaletzky, G. Tusn´ ady, and M. Ziermann. Extrem a of sums of heterogeneous quadratic forms. Linear Algebra Appl. , 269(1):331–365, 1998

work page 1998
[11]

Borg and J

I. Borg and J. Lingoes. Multidimensional Similarity Structure Analysis . Springer-Verlag, New York, 1987

work page 1987
[12]

Boumal, B

N. Boumal, B. Mishra, P.-A. Absil, and R. Sepulchre. Manopt, a Ma tlab toolbox for opti- mization on manifolds. J. Mach. Learning Res. , 15(42):1455–1459, 2014

work page 2014
[13]

Boyd and L

S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, Cambridge, UK, 2004

work page 2004
[14]

Cai, L.-H

Y. Cai, L.-H. Zhang, Z. Bai, and R.-C. Li. On an eigenvector-depe ndent nonlinear eigenvalue problem. SIAM J. Matrix Anal. Appl. , 39(3):1360–1382, 2018

work page 2018
[15]

M. T. Chu and N. T. Trendaﬁlov. The orthogonally constrained r egression revisited. J. Comput. Graph. Stat. , 10(4):746–771, 2001

work page 2001
[16]

J. P. Cunningham and Z. Ghahramani. Linear dimensionality reduc tion: Survey, insights, and generalizations. J. Mach. Learning Res. , 16:2859–2900, 2015

work page 2015
[17]

J. Demmel. Applied Numerical Linear Algebra . SIAM, Philadelphia, PA, 1997

work page 1997
[18]

Edelman, T

A. Edelman, T. A. Arias, and S. T. Smith. The geometry of algorit hms with orthogonality constraints. SIAM J. Matrix Anal. Appl. , 20(2):303–353, 1999. 84

work page 1999
[19]

Eld´ en and H

L. Eld´ en and H. Park. A procrustes problem on the Stiefel man ifold. Numer. Math. , 82:599– 619, 1999

work page 1999
[20]

K. Fan. On a theorem of Weyl concerning eigenvalues of linear tr ansformations. I. Proc. Natl. Acad. Sci. USA , 35(11):pp. 652–655, 1949

work page 1949
[21]

R. A. Fisher. The use of multiple measurements in taxonomic prob lems. Ann. Eugenics , 7(2):179–188, 1936

work page 1936
[22]

B. Gao, X. Liu, X. Chen, and Y.-X. Yuan. A new ﬁrst-order algor ithmic framework for optimization problems with orthogonality constraints. SIAM J. Optim. , 28(1):302–332, 2018

work page 2018
[23]

Golub and Q

G. Golub and Q. Ye. An inverse free preconditioned Krylov subsp ace methods for symmetric eigenvalue problems. SIAM J. Sci. Comput. , 24:312–334, 2002

work page 2002
[24]

G. H. Golub and C. F. Van Loan. Matrix Computations . Johns Hopkins University Press, Baltimore, Maryland, 4th edition, 2013

work page 2013
[25]

J. C. Gower and G. B. Dijksterhuis. Procrustes Problems. Oxford University Press, New York, 2004

work page 2004
[26]

N. J. Higham. Functions of Matrices: Theory and Computation . Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2008

work page 2008
[27]

Hohenberg and W

P. Hohenberg and W. Kohn. Inhomogeneous electron gas. Phys. Rev. , 136:B864–B871, 1964

work page 1964
[28]

R. A. Horn and C. R. Johnson. Topics in Matrix Analysis . Cambridge University Press, Cambridge, 1991

work page 1991
[29]

R. A. Horn and C. R. Johnson. Matrix Analysis . Cambridge University Press, New York, NY, 2nd edition, 2013

work page 2013
[30]

J. R. Hurley and R. B. Cattell. The Procrustes program: produ cing direct rotation to test a hypothesized factor structure. Comput. Behav. Sci. , 7:258–262, 1962

work page 1962
[31]

Imakura, R.-C

A. Imakura, R.-C. Li, and S.-L. Zhang. Locally optimal and heavy ball GMRES methods. Japan J. Indust. Appl. Math. , 33:471–499, 2016

work page 2016
[32]

Kanzow and H.-D

C. Kanzow and H.-D. Qi. A QP-free constrained Newton-type me thod for variational inequal- ity problems. Math. Program., 85:81–106, 1999

work page 1999
[33]

A. V. Knyazev. Toward the optimal preconditioned eigensolver : Locally optimal block pre- conditioned conjugate gradient method. SIAM J. Sci. Comput. , 23(2):517–541, 2001

work page 2001
[34]

Kohn and L

W. Kohn and L. J. Sham. Self-consistent equations including exc hange and correlation eﬀects. Phys. Rev. , 140:A1133–A1138, 1965

work page 1965
[35]

Kovaˇ c-Striko and K

J. Kovaˇ c-Striko and K. Veseli´ c. Some remarks on the spect ra of Hermitian matrices. Linear Algebra Appl., 145:221–229, 1991

work page 1991
[36]

Lehoucq, D

R. Lehoucq, D. C. Sorensen, and C. Yang. ARPACK User’s Guide . SIAM, Philadelphia, 1998

work page 1998
[37]

R.-C. Li. A perturbation bound for the generalized polar decomp osition. BIT, 33:304–308, 1993

work page 1993
[38]

R.-C. Li. New perturbation bounds for the unitary polar factor . SIAM J. Matrix Anal. Appl. , 16:327–332, 1995. 85

work page 1995
[39]

R.-C. Li. Accuracy of computed eigenvectors via optimizing a Ray leigh quotient. BIT, 44(3):585–593, 2004

work page 2004
[40]

R.-C. Li. Matrix perturbation theory. In L. Hogben, R. Brualdi, and G. W. Stewart, editors, Handbook of Linear Algebra, page Chapter 21. CRC Press, Boca Raton, FL, 2nd edition, 2014

work page 2014
[41]

R.-C. Li. Rayleigh quotient based optimization methods for eigenv alue problems. In Z. Bai, Weiguo Gao, and Yangfeng Su, editors, Matrix Functions and Matrix Equations , volume 19 of Series in Contemporary Applied Mathematics , pages 76–108. World Scientiﬁc, Singapore, 2015

work page 2015
[42]

Liang and R.-C

X. Liang and R.-C. Li. On generalizing trace minimization principles, I I. Linear Algebra Appl., 687:8–37, 2024

work page 2024
[43]

Liang, R.-C

X. Liang, R.-C. Li, and Z. Bai. Trace minimization principles for posit ive semi-deﬁnite pencils. Linear Algebra Appl. , 438:3085–3106, 2013

work page 2013
[44]

Liang, L

X. Liang, L. Wang, L.-H. Zhang, and R.-C. Li. On generalizing trac e minimization principles. Linear Algebra Appl. , 656:483–509, 2023

work page 2023
[45]

Liu, X.-F

X.-G. Liu, X.-F. Wang, and W.-G. Wang. Maximization of matrix trac e function of product Stiefel manifolds. SIAM J. Matrix Anal. Appl. , 36(4):1489–1506, 2015

work page 2015
[46]

Lu and R.-C

D. Lu and R.-C. Li. Locally unitarily invariantizable NEPv and conver gence analysis of SCF. Math. Comp. , 93(349):2291–2329, 2024. Published electronically: January 9, 2 024

work page 2024
[47]

Mor´ e and D

J. Mor´ e and D. Sorensen. Computing a trust region step. SIAM J. Sci. Statist. Comput. , 4(3):553–572, 1983

work page 1983
[48]

T. Ngo, M. Bellalij, and Y. Saad. The trace ratio optimization prob lem for dimensionality reduction. SIAM J. Matrix Anal. Appl. , 31(5):2950–2971, 2010

work page 2010
[49]

F. Nie, R. Zhang, and X. Li. A generalized power iteration method for solving quadratic problem on the Stiefel manifold. SCIENCE CHINA Info. Sci. , 60(11):1–10, 2017

work page 2017
[50]

Nocedal and S

J. Nocedal and S. Wright. Numerical Optimization . Springer, 2nd edition, 2006

work page 2006
[51]

B. N. Parlett. The Symmetric Eigenvalue Problem . SIAM, Philadelphia, 1998. This SIAM edition is an unabridged, corrected reproduction of the work ﬁrst published by Prentice-Hall, Inc., Englewood Cliﬀs, New Jersey, 1980

work page 1998
[52]

B. T. Polyak. Introduction to Optimization . Optimization Software, New York, 1987

work page 1987
[53]

Quillen and Q

P. Quillen and Q. Ye. A block inverse-free preconditioned Krylov s ubspace method for sym- metric generalized eigenvalue problems. J. Comput. Appl. Math. , 233(5):1298–1313, 2010

work page 2010
[54]

Rapcs´ ak

T. Rapcs´ ak. On minimization on Stiefel manifolds. European J. Oper. Res. , 143(2):365–376, 2002

work page 2002
[55]

J. D. Rutter. A serial implementation of Cuppen’s divide and conq uer algorithm for the symmetric eigenvalue problem. Technical Report UCB/CSD-94-799 , EECS Department, Uni- versity of California, Berkeley, February 1994

work page 1994
[56]

Y. Saad. Numerical Methods for Large Eigenvalue Problems . Manchester University Press, Manchester, UK, 1992

work page 1992
[57]

Y. Saad, J. R. Chelikowsky, and S. M. Shontz. Numerical metho ds for electronic structure calculations of materials. SIAM Rev. , 52(1):3–54, 2010. 86

work page 2010
[58]

G. W. Stewart and J. G. Sun. Matrix Perturbation Theory . Academic Press, Boston, 1990

work page 1990
[59]

J. G. Sun. Matrix Perturbation Analysis . Graduate Texts (Academia, Sinica). Science Pub- lisher, Beijing, 2nd edition, November 2001. in Chinese

work page 2001
[60]

Takahashi

I. Takahashi. A note on the conjugate gradient method. Inform. Process. Japan , 5:45–49, 1965

work page 1965
[61]

J. M. F. Ten Berge. Generalized approaches to the MAXBET pro blem and the MAXDIFF problem, with applications to canonical correlations. Psychometrika, 53(4):487–494, 1984

work page 1984
[62]

Teng and R.-C

Z. Teng and R.-C. Li. Variations of orthonormal basis matrices o f subspaces. Numer. Alg., Contr. Optim. , 2024. to appear

work page 2024
[63]

J. P. Van de Geer. Linear relations among k sets of variables. Psychometrika, 49(1):70–94, 1984

work page 1984
[64]

von Neumann

J. von Neumann. Some matrix-inequalities and metrization of mat rix-space. Tomck. Univ. Rev., 1:286–300, 1937

work page 1937
[65]

L. Wang, B. Gao, and X. Liu. Multipliers correction methods for o ptimization problems over the Stiefel manifold. CSIAM Trans. Appl. Math. , 2(3):508–531, 2021

work page 2021
[66]

Wang, L.-H

L. Wang, L.-H. Zhang, and R.-C. Li. Maximizing sum of coupled trac es with applications. Numer. Math. , 152:587–629, 2022. doi.org/10.1007/s00211-022-01322-y

work page doi:10.1007/s00211-022-01322-y 2022
[67]

Wang, L.-H

L. Wang, L.-H. Zhang, and R.-C. Li. Trace ratio optimization with a n application to multi- view learning. Math. Program., 201:97–131, 2023. doi.org/10.1007/s10107-022-01900-w

work page doi:10.1007/s10107-022-01900-w 2023
[68]

H. F. Weinberger. Variational Methods for Eigenvalue Approximation , volume 15 of CBMS- NSF Regional Conference Series in Applied Mathematics . SIAM, Philadelphia, 1974

work page 1974
[69]

Wen and W

Z. Wen and W. Yin. A feasible method for optimization with orthogo nality constraints. Math. Program., 142(1-2):397–434, 2013

work page 2013
[70]

C. Yang, J. C. Meza, B. Lee, and L.-W. Wang. KSSOL V—a MATLAB toolbox for solving the Kohn-Sham equations. ACM Trans. Math. Software , 36(2):1–35, 2009

work page 2009
[71]

Yang and R.-C

M. Yang and R.-C. Li. Heavy ball ﬂexible GMRES method for nonsym metric linear systems. J. Comp. Math. , 40(5):715–731, 2021

work page 2021
[72]

Zhang, L.-Z

L.-H. Zhang, L.-Z. Liao, and M. K. Ng. Fast algorithms for the ge neralized Foley-Sammon discriminant analysis. SIAM J. Matrix Anal. Appl. , 31(4):1584–1605, 2010

work page 2010
[73]

Zhang, L.-Z

L.-H. Zhang, L.-Z. Liao, and M. K. Ng. Superlinear convergence of a general algorithm for the generalized Foley-Sammon discriminant analysis. J. Optim. Theory Appl. , 157(3):853–865, 2013

work page 2013
[74]

Zhang, W

L.-H. Zhang, W. H. Yang, C. Shen, and J. Ying. An eigenvalue-ba sed method for the unbal- anced Procrustes problem. SIAM J. Matrix Anal. Appl. , 41(3):957–983, 2020

work page 2020
[75]

Zhang and R.-C

L.-H. Zhang and R.-C. Li. Maximization of the sum of the trace rat io on the Stiefel manifold, I: Theory. SCIENCE CHINA Math. , 57(12):2495–2508, 2014

work page 2014
[76]

Zhang and R.-C

L.-H. Zhang and R.-C. Li. Maximization of the sum of the trace rat io on the Stiefel manifold, II: Computation. SCIENCE CHINA Math. , 58(7):1549–1566, 2015

work page 2015
[77]

Zhang, L

L.-H. Zhang, L. Wang, Z. Bai, and R.-C. Li. A self-consistent-ﬁe ld iteration for orthogonal canonical correlation analysis. IEEE Trans. Pattern Anal. Mach. Intell. , 44(2):890–904, 2022. 87

work page 2022
[78]

Zhang and K

Z. Zhang and K. Du. Successive projection method for solving t he unbalanced Procrustes problem. SCIENCE CHINA Math. , 49(7):971–986, 2006

work page 2006
[79]

Zhang, Z

Z. Zhang, Z. Zhai, and L. Li. Uniform projection for multi-view lea rning. IEEE Trans. Pattern Anal. Mach. Intell. , 39(8):1675–1689, 2017

work page 2017
[80]

H. Zhao, Z. Wang, and F. Nie. Orthogonal least squares regre ssion for feature extraction. Neurocomputing, 216:200–207, 2016

work page 2016

Showing first 80 references.

[1] [1]

Absil, R

P.-A. Absil, R. Mahony, and R. Sepulchre. Optimization Algorithms On Matrix Manifolds . Princeton University Press, Princeton, NJ, 2008

work page 2008

[2] [2]

Anderson, Z

E. Anderson, Z. Bai, C. Bischof, J. Demmel, J. Dongarra, J. Du C roz, A. Greenbaum, S. Ham- marling, A. McKenney, S. Ostrouchov, and D. Sorensen. LAPACK Users’ Guide . SIAM, Philadelphia, 3rd edition, 1999

work page 1999

[3] [3]

Z. Bai, J. Demmel, J. Dongarra, A. Ruhe, and H. van der Vorst (e ditors). Templates for the solution of Algebraic Eigenvalue Problems: A Practical Gui de. SIAM, Philadelphia, 2000

work page 2000

[4] [4]

Bai, R.-C

Z. Bai, R.-C. Li, and D. Lu. Sharp estimation of convergence rate for self-consistent ﬁeld iteration to solve eigenvector-dependent nonlinear eigenvalue pro blems. SIAM J. Matrix Anal. Appl., 43(1):301–327, 2022

work page 2022

[5] [5]

Bai and D

Z. Bai and D. Lu. Variational characterization of monotone non linear eigenvector problems and geometry of self-consistent-ﬁeld iteration. SIAM J. Matrix Anal. Appl. , 46(1):84–111, 2024

work page 2024

[6] [6]

Balogh, T

J. Balogh, T. Csendes, and T. Rapcs’a. Some global optimization p roblems on Stiefel mani- folds. J. Global Optim. , 30:91–101, 2004

work page 2004

[7] [7]

Benner and X

P. Benner and X. Liang. Convergence analysis of vector extend ed locally optimal block pre- conditioned extended conjugate gradient method for computing e xtreme eigenvalues. Numer. Linear Algebra Appl. , 29(6):e2445, 2022. 24 pages

work page 2022

[8] [8]

R. Bhatia. Matrix Analysis . Graduate Texts in Mathematics, vol. 169. Springer, New York, 1996

work page 1996

[9] [9]

Birtea, I

P. Birtea, I. Ca¸ su, and D. Com˘ anescu. First order optimality c onditions and steepest descent algorithm on orthogonal Stiefel manifolds. Opt. Lett. , 13:1773–1791, 2019

work page 2019

[10] [10]

Bolla, G

M. Bolla, G. Michaletzky, G. Tusn´ ady, and M. Ziermann. Extrem a of sums of heterogeneous quadratic forms. Linear Algebra Appl. , 269(1):331–365, 1998

work page 1998

[11] [11]

Borg and J

I. Borg and J. Lingoes. Multidimensional Similarity Structure Analysis . Springer-Verlag, New York, 1987

work page 1987

[12] [12]

Boumal, B

N. Boumal, B. Mishra, P.-A. Absil, and R. Sepulchre. Manopt, a Ma tlab toolbox for opti- mization on manifolds. J. Mach. Learning Res. , 15(42):1455–1459, 2014

work page 2014

[13] [13]

Boyd and L

S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, Cambridge, UK, 2004

work page 2004

[14] [14]

Cai, L.-H

Y. Cai, L.-H. Zhang, Z. Bai, and R.-C. Li. On an eigenvector-depe ndent nonlinear eigenvalue problem. SIAM J. Matrix Anal. Appl. , 39(3):1360–1382, 2018

work page 2018

[15] [15]

M. T. Chu and N. T. Trendaﬁlov. The orthogonally constrained r egression revisited. J. Comput. Graph. Stat. , 10(4):746–771, 2001

work page 2001

[16] [16]

J. P. Cunningham and Z. Ghahramani. Linear dimensionality reduc tion: Survey, insights, and generalizations. J. Mach. Learning Res. , 16:2859–2900, 2015

work page 2015

[17] [17]

J. Demmel. Applied Numerical Linear Algebra . SIAM, Philadelphia, PA, 1997

work page 1997

[18] [18]

Edelman, T

A. Edelman, T. A. Arias, and S. T. Smith. The geometry of algorit hms with orthogonality constraints. SIAM J. Matrix Anal. Appl. , 20(2):303–353, 1999. 84

work page 1999

[19] [19]

Eld´ en and H

L. Eld´ en and H. Park. A procrustes problem on the Stiefel man ifold. Numer. Math. , 82:599– 619, 1999

work page 1999

[20] [20]

K. Fan. On a theorem of Weyl concerning eigenvalues of linear tr ansformations. I. Proc. Natl. Acad. Sci. USA , 35(11):pp. 652–655, 1949

work page 1949

[21] [21]

R. A. Fisher. The use of multiple measurements in taxonomic prob lems. Ann. Eugenics , 7(2):179–188, 1936

work page 1936

[22] [22]

B. Gao, X. Liu, X. Chen, and Y.-X. Yuan. A new ﬁrst-order algor ithmic framework for optimization problems with orthogonality constraints. SIAM J. Optim. , 28(1):302–332, 2018

work page 2018

[23] [23]

Golub and Q

G. Golub and Q. Ye. An inverse free preconditioned Krylov subsp ace methods for symmetric eigenvalue problems. SIAM J. Sci. Comput. , 24:312–334, 2002

work page 2002

[24] [24]

G. H. Golub and C. F. Van Loan. Matrix Computations . Johns Hopkins University Press, Baltimore, Maryland, 4th edition, 2013

work page 2013

[25] [25]

J. C. Gower and G. B. Dijksterhuis. Procrustes Problems. Oxford University Press, New York, 2004

work page 2004

[26] [26]

N. J. Higham. Functions of Matrices: Theory and Computation . Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2008

work page 2008

[27] [27]

Hohenberg and W

P. Hohenberg and W. Kohn. Inhomogeneous electron gas. Phys. Rev. , 136:B864–B871, 1964

work page 1964

[28] [28]

R. A. Horn and C. R. Johnson. Topics in Matrix Analysis . Cambridge University Press, Cambridge, 1991

work page 1991

[29] [29]

R. A. Horn and C. R. Johnson. Matrix Analysis . Cambridge University Press, New York, NY, 2nd edition, 2013

work page 2013

[30] [30]

J. R. Hurley and R. B. Cattell. The Procrustes program: produ cing direct rotation to test a hypothesized factor structure. Comput. Behav. Sci. , 7:258–262, 1962

work page 1962

[31] [31]

Imakura, R.-C

A. Imakura, R.-C. Li, and S.-L. Zhang. Locally optimal and heavy ball GMRES methods. Japan J. Indust. Appl. Math. , 33:471–499, 2016

work page 2016

[32] [32]

Kanzow and H.-D

C. Kanzow and H.-D. Qi. A QP-free constrained Newton-type me thod for variational inequal- ity problems. Math. Program., 85:81–106, 1999

work page 1999

[33] [33]

A. V. Knyazev. Toward the optimal preconditioned eigensolver : Locally optimal block pre- conditioned conjugate gradient method. SIAM J. Sci. Comput. , 23(2):517–541, 2001

work page 2001

[34] [34]

Kohn and L

W. Kohn and L. J. Sham. Self-consistent equations including exc hange and correlation eﬀects. Phys. Rev. , 140:A1133–A1138, 1965

work page 1965

[35] [35]

Kovaˇ c-Striko and K

J. Kovaˇ c-Striko and K. Veseli´ c. Some remarks on the spect ra of Hermitian matrices. Linear Algebra Appl., 145:221–229, 1991

work page 1991

[36] [36]

Lehoucq, D

R. Lehoucq, D. C. Sorensen, and C. Yang. ARPACK User’s Guide . SIAM, Philadelphia, 1998

work page 1998

[37] [37]

R.-C. Li. A perturbation bound for the generalized polar decomp osition. BIT, 33:304–308, 1993

work page 1993

[38] [38]

R.-C. Li. New perturbation bounds for the unitary polar factor . SIAM J. Matrix Anal. Appl. , 16:327–332, 1995. 85

work page 1995

[39] [39]

R.-C. Li. Accuracy of computed eigenvectors via optimizing a Ray leigh quotient. BIT, 44(3):585–593, 2004

work page 2004

[40] [40]

R.-C. Li. Matrix perturbation theory. In L. Hogben, R. Brualdi, and G. W. Stewart, editors, Handbook of Linear Algebra, page Chapter 21. CRC Press, Boca Raton, FL, 2nd edition, 2014

work page 2014

[41] [41]

R.-C. Li. Rayleigh quotient based optimization methods for eigenv alue problems. In Z. Bai, Weiguo Gao, and Yangfeng Su, editors, Matrix Functions and Matrix Equations , volume 19 of Series in Contemporary Applied Mathematics , pages 76–108. World Scientiﬁc, Singapore, 2015

work page 2015

[42] [42]

Liang and R.-C

X. Liang and R.-C. Li. On generalizing trace minimization principles, I I. Linear Algebra Appl., 687:8–37, 2024

work page 2024

[43] [43]

Liang, R.-C

X. Liang, R.-C. Li, and Z. Bai. Trace minimization principles for posit ive semi-deﬁnite pencils. Linear Algebra Appl. , 438:3085–3106, 2013

work page 2013

[44] [44]

Liang, L

X. Liang, L. Wang, L.-H. Zhang, and R.-C. Li. On generalizing trac e minimization principles. Linear Algebra Appl. , 656:483–509, 2023

work page 2023

[45] [45]

Liu, X.-F

X.-G. Liu, X.-F. Wang, and W.-G. Wang. Maximization of matrix trac e function of product Stiefel manifolds. SIAM J. Matrix Anal. Appl. , 36(4):1489–1506, 2015

work page 2015

[46] [46]

Lu and R.-C

D. Lu and R.-C. Li. Locally unitarily invariantizable NEPv and conver gence analysis of SCF. Math. Comp. , 93(349):2291–2329, 2024. Published electronically: January 9, 2 024

work page 2024

[47] [47]

Mor´ e and D

J. Mor´ e and D. Sorensen. Computing a trust region step. SIAM J. Sci. Statist. Comput. , 4(3):553–572, 1983

work page 1983

[48] [48]

T. Ngo, M. Bellalij, and Y. Saad. The trace ratio optimization prob lem for dimensionality reduction. SIAM J. Matrix Anal. Appl. , 31(5):2950–2971, 2010

work page 2010

[49] [49]

F. Nie, R. Zhang, and X. Li. A generalized power iteration method for solving quadratic problem on the Stiefel manifold. SCIENCE CHINA Info. Sci. , 60(11):1–10, 2017

work page 2017

[50] [50]

Nocedal and S

J. Nocedal and S. Wright. Numerical Optimization . Springer, 2nd edition, 2006

work page 2006

[51] [51]

B. N. Parlett. The Symmetric Eigenvalue Problem . SIAM, Philadelphia, 1998. This SIAM edition is an unabridged, corrected reproduction of the work ﬁrst published by Prentice-Hall, Inc., Englewood Cliﬀs, New Jersey, 1980

work page 1998

[52] [52]

B. T. Polyak. Introduction to Optimization . Optimization Software, New York, 1987

work page 1987

[53] [53]

Quillen and Q

P. Quillen and Q. Ye. A block inverse-free preconditioned Krylov s ubspace method for sym- metric generalized eigenvalue problems. J. Comput. Appl. Math. , 233(5):1298–1313, 2010

work page 2010

[54] [54]

Rapcs´ ak

T. Rapcs´ ak. On minimization on Stiefel manifolds. European J. Oper. Res. , 143(2):365–376, 2002

work page 2002

[55] [55]

J. D. Rutter. A serial implementation of Cuppen’s divide and conq uer algorithm for the symmetric eigenvalue problem. Technical Report UCB/CSD-94-799 , EECS Department, Uni- versity of California, Berkeley, February 1994

work page 1994

[56] [56]

Y. Saad. Numerical Methods for Large Eigenvalue Problems . Manchester University Press, Manchester, UK, 1992

work page 1992

[57] [57]

Y. Saad, J. R. Chelikowsky, and S. M. Shontz. Numerical metho ds for electronic structure calculations of materials. SIAM Rev. , 52(1):3–54, 2010. 86

work page 2010

[58] [58]

G. W. Stewart and J. G. Sun. Matrix Perturbation Theory . Academic Press, Boston, 1990

work page 1990

[59] [59]

J. G. Sun. Matrix Perturbation Analysis . Graduate Texts (Academia, Sinica). Science Pub- lisher, Beijing, 2nd edition, November 2001. in Chinese

work page 2001

[60] [60]

Takahashi

I. Takahashi. A note on the conjugate gradient method. Inform. Process. Japan , 5:45–49, 1965

work page 1965

[61] [61]

J. M. F. Ten Berge. Generalized approaches to the MAXBET pro blem and the MAXDIFF problem, with applications to canonical correlations. Psychometrika, 53(4):487–494, 1984

work page 1984

[62] [62]

Teng and R.-C

Z. Teng and R.-C. Li. Variations of orthonormal basis matrices o f subspaces. Numer. Alg., Contr. Optim. , 2024. to appear

work page 2024

[63] [63]

J. P. Van de Geer. Linear relations among k sets of variables. Psychometrika, 49(1):70–94, 1984

work page 1984

[64] [64]

von Neumann

J. von Neumann. Some matrix-inequalities and metrization of mat rix-space. Tomck. Univ. Rev., 1:286–300, 1937

work page 1937

[65] [65]

L. Wang, B. Gao, and X. Liu. Multipliers correction methods for o ptimization problems over the Stiefel manifold. CSIAM Trans. Appl. Math. , 2(3):508–531, 2021

work page 2021

[66] [66]

Wang, L.-H

L. Wang, L.-H. Zhang, and R.-C. Li. Maximizing sum of coupled trac es with applications. Numer. Math. , 152:587–629, 2022. doi.org/10.1007/s00211-022-01322-y

work page doi:10.1007/s00211-022-01322-y 2022

[67] [67]

Wang, L.-H

L. Wang, L.-H. Zhang, and R.-C. Li. Trace ratio optimization with a n application to multi- view learning. Math. Program., 201:97–131, 2023. doi.org/10.1007/s10107-022-01900-w

work page doi:10.1007/s10107-022-01900-w 2023

[68] [68]

H. F. Weinberger. Variational Methods for Eigenvalue Approximation , volume 15 of CBMS- NSF Regional Conference Series in Applied Mathematics . SIAM, Philadelphia, 1974

work page 1974

[69] [69]

Wen and W

Z. Wen and W. Yin. A feasible method for optimization with orthogo nality constraints. Math. Program., 142(1-2):397–434, 2013

work page 2013

[70] [70]

C. Yang, J. C. Meza, B. Lee, and L.-W. Wang. KSSOL V—a MATLAB toolbox for solving the Kohn-Sham equations. ACM Trans. Math. Software , 36(2):1–35, 2009

work page 2009

[71] [71]

Yang and R.-C

M. Yang and R.-C. Li. Heavy ball ﬂexible GMRES method for nonsym metric linear systems. J. Comp. Math. , 40(5):715–731, 2021

work page 2021

[72] [72]

Zhang, L.-Z

L.-H. Zhang, L.-Z. Liao, and M. K. Ng. Fast algorithms for the ge neralized Foley-Sammon discriminant analysis. SIAM J. Matrix Anal. Appl. , 31(4):1584–1605, 2010

work page 2010

[73] [73]

Zhang, L.-Z

L.-H. Zhang, L.-Z. Liao, and M. K. Ng. Superlinear convergence of a general algorithm for the generalized Foley-Sammon discriminant analysis. J. Optim. Theory Appl. , 157(3):853–865, 2013

work page 2013

[74] [74]

Zhang, W

L.-H. Zhang, W. H. Yang, C. Shen, and J. Ying. An eigenvalue-ba sed method for the unbal- anced Procrustes problem. SIAM J. Matrix Anal. Appl. , 41(3):957–983, 2020

work page 2020

[75] [75]

Zhang and R.-C

L.-H. Zhang and R.-C. Li. Maximization of the sum of the trace rat io on the Stiefel manifold, I: Theory. SCIENCE CHINA Math. , 57(12):2495–2508, 2014

work page 2014

[76] [76]

Zhang and R.-C

L.-H. Zhang and R.-C. Li. Maximization of the sum of the trace rat io on the Stiefel manifold, II: Computation. SCIENCE CHINA Math. , 58(7):1549–1566, 2015

work page 2015

[77] [77]

Zhang, L

L.-H. Zhang, L. Wang, Z. Bai, and R.-C. Li. A self-consistent-ﬁe ld iteration for orthogonal canonical correlation analysis. IEEE Trans. Pattern Anal. Mach. Intell. , 44(2):890–904, 2022. 87

work page 2022

[78] [78]

Zhang and K

Z. Zhang and K. Du. Successive projection method for solving t he unbalanced Procrustes problem. SCIENCE CHINA Math. , 49(7):971–986, 2006

work page 2006

[79] [79]

Zhang, Z

Z. Zhang, Z. Zhai, and L. Li. Uniform projection for multi-view lea rning. IEEE Trans. Pattern Anal. Mach. Intell. , 39(8):1675–1689, 2017

work page 2017

[80] [80]

H. Zhao, Z. Wang, and F. Nie. Orthogonal least squares regre ssion for feature extraction. Neurocomputing, 216:200–207, 2016

work page 2016