pith. sign in

arxiv: 2605.17793 · v1 · pith:LXDHZQHZnew · submitted 2026-05-18 · 🧮 math.NA · cs.NA

Convergence Analysis of Two Alternating Iterative Schemes for Tucker Decomposition

Pith reviewed 2026-05-20 01:11 UTC · model grok-4.3

classification 🧮 math.NA cs.NA
keywords Tucker decompositionhigher-order orthogonal iterationalternating subspace iterationglobal convergencecomplex tensorsmonotonic objectivestationary points
0
0 comments X

The pith

Two alternating methods for Tucker decomposition converge globally to stationary points for complex tensors while the objective increases monotonically at each step.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proves global convergence for the higher-order orthogonal iteration and alternating subspace iteration schemes used to compute Tucker decompositions. It extends prior real-tensor analyses to the complex case and closes gaps in earlier ASI proofs. A sympathetic reader would care because these guarantees let practitioners run the iterations knowing the objective improves steadily and the process reaches a stationary point rather than cycling or diverging.

Core claim

Both the higher-order orthogonal iteration (including its greedy variant) and the alternating subspace iteration converge globally to stationary points of the Tucker objective for complex tensors under mild conditions, with the objective function strictly increasing throughout the iterations; real tensors are recovered as a special case.

What carries the argument

Monotonic increase of the objective function paired with a global convergence argument that tracks the iterates across the orthogonal constraints of the Tucker factors.

If this is right

  • Users can apply these schemes to complex-valued data with the assurance that the objective will rise and the process will stop at a stationary point.
  • The same monotonicity property supplies a practical stopping criterion based on objective change.
  • The analysis covers both real and complex tensors without requiring separate proofs.
  • Earlier gaps in the 1980 analysis of ASI are now filled.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar monotonicity-plus-global-convergence arguments might apply to other alternating least-squares schemes for tensor factorizations if the objective is bounded above and the factors remain on compact sets.
  • The mild conditions could be made explicit in future work to identify the largest class of tensors for which the guarantees hold.
  • Numerical monitoring of the objective increase offers a simple diagnostic that the iteration is behaving as predicted.

Load-bearing premise

The convergence holds only under mild but unspecified conditions on the input tensor and the iteration parameters.

What would settle it

A concrete complex tensor together with iteration parameters satisfying the stated mild conditions for which either method produces a non-monotonic objective sequence or fails to approach a stationary point.

Figures

Figures reproduced from arXiv: 2605.17793 by Li Wang, Mei Yang, Ren-Cang Li.

Figure 6.1
Figure 6.1. Figure 6.1: Performance statistics on the Miranda tensor: core tensor size 4s×s×s for 4 ≤ s ≤ 20. Left: CPU time; Middle: KKT residual ˜ǫKKT,j at convergence; Right: approximation error. By “init(random)”, it means that random initial guess (6.1) is used and similarly “init(HOSVD)” means initial guess by (6.2). The CPU time for “HOOI+init(HOSVD)” and “ASI+init(HOSVD)” does not include the time used for computing ini… view at source ↗
Figure 6.2
Figure 6.2. Figure 6.2: Convergence behavior in terms of normalized KKT residual ˜ǫKKT,j on the Mi￾randa tensor: core tensor size 4s × s × s for s = 4, 8, 20. It is interesting to notice that for s = 8, “ASI+init(HOSVD)” starts out with a much better initial guess than “ASI+init(random)”, but is overtaken by the latter at iteration 27 and ends up with taking 82 iterations while “ASI+init(random)” takes 122 iterations. 4s × s × … view at source ↗
Figure 6
Figure 6. Figure 6: displays the convergence history in terms of KKT r [PITH_FULL_IMAGE:figures/full_fig_p025_6.png] view at source ↗
Figure 6.3
Figure 6.3. Figure 6.3: Convergence in terms of KKT residual ˜ǫKKT,j as defined in (4.4) and objective value on real tensors with [n1, n2, n3] = [600, 550, 500] and complex tensors with [n1, n2, n3] = [480, 440, 400], and [k1, k2, k3] = [12, 11, 10], as η varies in {2 −3 , 2 −4 , 2 −5}. In each case: real or complex, the first row is for KKT residual ˜ǫKKT,j while the second row is for objective value. 29 [PITH_FULL_IMAGE:figu… view at source ↗
Figure 6.4
Figure 6.4. Figure 6.4: Scalability of HOOI and ASI on real tensors with [n1, n2, n3] varying according to (6.4) for 1 ≤ s ≤ 8 and [k1, k2, k3] = [12, 11, 10]. Left panel: KKT residual ˜ǫKKT, Middle panel: CPU time, and Right panel: the number of iterations. 30 [PITH_FULL_IMAGE:figures/full_fig_p030_6_4.png] view at source ↗
Figure 6.5
Figure 6.5. Figure 6.5: Scalability of HOOI and ASI on complex tensors with [n1, n2, n3] varying according to (6.4) for 1 ≤ s ≤ 8 and [k1, k2, k3] = [12, 11, 10]. Left panel: KKT residual ˜ǫKKT, Middle panel: CPU time, and Right panel: the number of iterations. 31 [PITH_FULL_IMAGE:figures/full_fig_p031_6_5.png] view at source ↗
read the original abstract

The higher-order orthogonal iteration (HOOI) and the alternating subspace iteration (ASI) are two popular numerical methods for computing the Tucker decomposition of a multiple-mode tensor. Xu [Linear and Multilinear Algebra, 66(11):2247--2265, 2018] proposed a variation of HOOI, called the greedy HOOI, which has an extra alignment action between consecutive approximations. Kroonenberg and De Leeuw [Psychometrika, 45(1):69--97, 1980] analyzed the convergence of ASI but their analysis has gaps. These analysis were for a real tensor only. In this paper, we present detailed convergence analysis of the two methods that is applicable to a complex tensor with a real tensor being a special case, and it is shown both methods are globally convergent to stationary points under mild conditions while the objective function monotonically increases. Numerical examples are presented to demonstrate the convergence behavior of the methods.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 4 minor

Summary. The manuscript analyzes the convergence of the higher-order orthogonal iteration (HOOI) and alternating subspace iteration (ASI) for computing the Tucker decomposition of complex tensors (with real tensors as a special case). It supplies detailed proofs that both methods converge globally to stationary points under explicitly stated mild conditions (boundedness of the iterates plus a non-degeneracy condition on the singular values of the mode-unfoldings), while the objective function increases monotonically; the proofs close gaps in the 1980 Kroonenberg–De Leeuw analysis of ASI and reduce the complex case to the real case via real/imaginary splitting. Numerical examples illustrate the observed convergence behavior.

Significance. If the proofs are correct, the work supplies a rigorous theoretical justification for two widely used alternating schemes in tensor decomposition, extending them reliably to complex data that arise in signal processing and quantum applications. The Lyapunov-style arguments and explicit handling of the complex case constitute a clear advance over prior incomplete analyses.

major comments (2)
  1. [Theorem 3.1] Theorem 3.1: the non-degeneracy assumption on the singular values of the mode-unfoldings is stated as part of the mild conditions, yet the manuscript does not discuss how restrictive this assumption is for generic or noisy tensors; a short remark on its genericity would strengthen the claim that the result applies broadly.
  2. [§4.2] §4.2, proof of Theorem 4.2: the Lyapunov function is introduced to establish monotonicity, but the precise expression and the verification that it is bounded below are only sketched; an expanded paragraph confirming that the argument carries over without additional hidden assumptions for the complex case would remove any remaining doubt.
minor comments (4)
  1. [Abstract] The abstract contains a minor grammatical issue: 'These analysis were' should read 'These analyses were'.
  2. [Throughout] Notation: the manuscript occasionally uses both 'X' and bold 'X' for the same tensor; a single consistent convention (e.g., calligraphic or bold) would improve readability.
  3. [Numerical examples] Numerical section: the tensor dimensions and noise levels used in the examples should be stated explicitly in the text rather than only in figure captions.
  4. [Introduction] A reference to the original Xu (2018) greedy-HOOI paper is present, but the manuscript would benefit from citing one or two recent works on complex Tucker decomposition for context.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive recommendation of minor revision and for the constructive comments. We address each major comment below and will incorporate revisions to strengthen the manuscript.

read point-by-point responses
  1. Referee: [Theorem 3.1] Theorem 3.1: the non-degeneracy assumption on the singular values of the mode-unfoldings is stated as part of the mild conditions, yet the manuscript does not discuss how restrictive this assumption is for generic or noisy tensors; a short remark on its genericity would strengthen the claim that the result applies broadly.

    Authors: We agree that a brief discussion of the assumption's genericity would be helpful. In the revised manuscript we will add a short remark after Theorem 3.1 noting that the non-degeneracy condition holds for generic tensors: the set of tensors for which any mode-unfolding has repeated singular values is a lower-dimensional algebraic variety and hence has Lebesgue measure zero. Consequently the assumption is satisfied almost everywhere, including for generic noisy tensors whose perturbations generically separate the singular values. revision: yes

  2. Referee: [§4.2] §4.2, proof of Theorem 4.2: the Lyapunov function is introduced to establish monotonicity, but the precise expression and the verification that it is bounded below are only sketched; an expanded paragraph confirming that the argument carries over without additional hidden assumptions for the complex case would remove any remaining doubt.

    Authors: We thank the referee for this observation. In the revised version we will expand the relevant paragraph in §4.2 to state the precise Lyapunov function explicitly and to verify that it is bounded from below. We will also add a short confirmation that the real/imaginary splitting reduces the complex case to the real case without introducing extra assumptions, since the monotonicity and lower-boundedness properties are preserved under this embedding. revision: yes

Circularity Check

0 steps flagged

No significant circularity identified

full rationale

The paper's central results are global convergence proofs for the HOOI and ASI algorithms applied to Tucker decomposition of complex tensors (with real tensors as a special case). These rest on explicit mild conditions stated in Theorems 3.1 and 4.2 (boundedness of the sequence and non-degeneracy of singular values of mode-unfoldings), a Lyapunov argument that establishes monotonic increase of the objective, and a real/imaginary splitting reduction from the complex to the real case. No derivation step reduces by construction to a fitted parameter, self-definition, or load-bearing self-citation whose validity is presupposed inside the paper; the proofs close documented gaps in prior real-tensor analyses (Kroonenberg-De Leeuw) using standard iterative-method techniques that are independently verifiable. The manuscript is therefore self-contained against external mathematical benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on standard mathematical properties of tensors and iterative optimization; no free parameters, invented entities, or ad-hoc axioms are identifiable from the abstract.

axioms (1)
  • domain assumption Mild conditions on the tensor and iteration suffice for global convergence
    The abstract states convergence holds under mild conditions but does not enumerate them.

pith-pipeline@v0.9.0 · 5692 in / 1181 out tokens · 46826 ms · 2026-05-20T01:11:39.271929+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

32 extracted references · 32 canonical work pages

  1. [1]

    Ballard and

    G. Ballard and . G. Kolda. Tensor Decompositions for Data Science . Cambridge University Press, 2025

  2. [2]

    Bolte, S

    J. Bolte, S. Sabach, and M. Teboulle. Proximal alternating lineariz ed minimization for non- convex and nonsmooth problems. Math. Program., 146(1):459–494, 2014

  3. [3]

    De Lathauwer, B

    L. De Lathauwer, B. De Moor, and J. Vandewalle. On the best ran k-1 and rank-( r1, r 2, . . . , r n) approximation of higher-order tensors. SIAM J. Matrix Anal. Appl. , 21(4):1324–1342, 2000

  4. [4]

    De Lathauwer, Bart De Moor, and Joos Vandewalle

    L. De Lathauwer, Bart De Moor, and Joos Vandewalle. A multilinear singular value decom- position. SIAM J. Matrix Anal. Appl. , 21(4):1253–1278, 2000

  5. [5]

    J. Demmel. Applied Numerical Linear Algebra . SIAM, Philadelphia, PA, 1997

  6. [6]

    D. A. d’Esopo. A convex programming procedure. Naval Research Logistics Quarterly , 6(1):33–42, 1959

  7. [7]

    L. Eld´ en. Matrix Methods in Data Mining and Pattern Recognition . SIAM, Philadelphia, 2007. 26

  8. [8]

    G. H. Golub and C. F. Van Loan. Matrix Computations . Johns Hopkins University Press, Baltimore, Maryland, 4th edition, 2013

  9. [9]

    Kanzow and H.-D

    C. Kanzow and H.-D. Qi. A QP-free constrained Newton-type met hod for variational inequal- ity problems. Math. Program., 85:81–106, 1999

  10. [10]

    H. A. L. Kiers and A. Der Kinderen. A fast method for choosing t he numbers of components in Tucker3 analysis. British J. Math. Stat. Psychology , 56(1):119–125, 2003

  11. [11]

    T. G. Kolda and B. W. Bader. Tensor decompositions and applicat ions. SIAM Rev., 51(3):455– 500, 2009

  12. [12]

    Kovaˇ c-Striko and K

    J. Kovaˇ c-Striko and K. Veseli´ c. Some remarks on the spect ra of Hermitian matrices. Linear Algebra Appl., 145:221–229, 1991

  13. [13]

    P. M. Kroonenberg and J. De Leeuw. Principal component analy sis of three-mode data by means of alternating least squares algorithms. Psychometrika, 45(1):69–97, 1980

  14. [14]

    R.-C. Li. A perturbation bound for the generalized polar decomp osition. BIT, 33:304–308, 1993

  15. [15]

    R.-C. Li. New perturbation bounds for the unitary polar factor . SIAM J. Matrix Anal. Appl. , 16:327–332, 1995

  16. [16]

    R.-C. Li. Accuracy of computed eigenvectors via optimizing a Ray leigh quotient. BIT, 44(3):585–593, 2004

  17. [17]

    R.-C. Li. Matrix perturbation theory. In L. Hogben, R. Brualdi, and G. W. Stewart, editors, Handbook of Linear Algebra, page Chapter 21. CRC Press, Boca Raton, FL, 2nd edition, 2014

  18. [18]

    R.-C. Li. Approximations of extremal eigenspace and orthonor mal polar factor. Linear Algebra Appl., 2026. Appeared online January 22, 2026

  19. [19]

    R.-C. Li. A theory of the NEPv approach for optimization on the S tiefel manifold. Found. Comput. Math. , 26:179–244, 2026

  20. [20]

    R.-C. Li, L. Wang, and M. Yang. An NPDo approach for tensor blo ck-diagonalization, August

  21. [21]

    Mor´ e and D

    J. Mor´ e and D. Sorensen. Computing a trust region step. SIAM J. Sci. Statist. Comput. , 4(3):553–572, 1983

  22. [22]

    G. W. Stewart. Matrix Algorithms, Vol. II: Eigensystems . SIAM, Philadelphia, 2001

  23. [23]

    G. W. Stewart and J.-G. Sun. Matrix Perturbation Theory . Academic Press, Boston, 1990

  24. [24]

    L. R. Tucker. Implications of factor analysis of three-way mat rices for measurement of change. In C. W. Harris, editor, Problems in Measuring Change , volume 15, pages 122–137. University of Wisconsin Press, 1963

  25. [25]

    L. R. Tucker. The extension of factor analysis to three-dimen sional matrices. In H. Gulliksen and N. Frederiksen, editors, Contributions to Mathematical Psychology , volume 110119, pages 110–182. Holt, Rinehart and Winston New York, 1964

  26. [26]

    L. R. Tucker. Some mathematical notes on three-mode facto r analysis. Psychometrika, 31(3):279–311, 1966

  27. [27]

    L. R. Tucker. Relations between multidimensional scaling and thr ee-mode factor analysis. Psychometrika, 37(1):3–27, 1972. 27

  28. [28]

    L. Wang, J. Wang, and R.-C. Li. Sparse PCA via matrix (21)-norm regularization with an application to feature selection. Resul. Appl. Math. , 28(100676):16 pages, 2025

  29. [29]

    Wang, L.-H

    L. Wang, L.-H. Zhang, and R.-C. Li. Maximizing sum of coupled trac es with applications. Numer. Math. , 152:587–629, 2022

  30. [30]

    H. F. Weinberger. Variational Methods for Eigenvalue Approximation , volume 15 of CBMS- NSF Regional Conference Series in Applied Mathematics . SIAM, Philadelphia, 1974

  31. [31]

    Y. Xu. On the convergence of higher-order orthogonal itera tion. Lin. Multilin. Alg. , 66(11):2247–2265, 2018

  32. [32]

    Zhang and G

    T. Zhang and G. H. Golub. Rank-one approximation to high order tensors. SIAM J. Matrix Anal. Appl. , 23(2):534–550, 2001. 28 η = 2 − 3 η = 2 − 4 η = 2 − 5 real 0 1000 2000 3000 4000 5000 6000 iteration j 10-8 10-7 10-6 10-5 10-4 KKT residual KKT =2-3 ASI HOOI 0 5 10 15 20 25 iteration j 10-10 10-9 10-8 10-7 10-6 10-5 10-4 10-3 KKT residual KKT =2-4 ASI HO...