pith. sign in

arxiv: 2304.03641 · v4 · pith:CNRWB77Inew · submitted 2023-04-07 · 🧮 math.OC · cs.LG· cs.NA· math.NA

A Block Coordinate Descent Method for Nonsmooth Composite Optimization under Orthogonality Constraints

Pith reviewed 2026-05-24 09:20 UTC · model grok-4.3

classification 🧮 math.OC cs.LGcs.NAmath.NA
keywords block coordinate descentorthogonality constraintsnonsmooth composite optimizationglobal block-k stationary pointsiteration complexityKurdyka-Lojasiewicz inequality
0
0 comments X

The pith

OBCD updates k rows at a time by exactly solving small nonsmooth subproblems to reach global block-k stationary points with O(1/ε) iteration complexity.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents OBCD, a block coordinate descent method for nonsmooth composite optimization under orthogonality constraints. In each step it selects k rows of the current matrix and replaces them by the global minimizer of a small subproblem that keeps the row block orthogonal. The authors prove that such row-wise updates are complete, meaning any feasible point is reachable from any other, and that the limit points satisfy a global block-k stationarity condition stronger than ordinary criticality. They further establish an iteration complexity of O(1/ε) to produce an ε-block-k stationary point and a non-ergodic rate under the Kurdyka–Lojasiewicz inequality.

Core claim

The central claim is that the OBCD scheme, by globally solving the nonsmooth subproblems that arise when k rows are updated under orthogonality constraints, produces limit points that are global block-k stationary points and reaches an ε-block-k stationary point after O(1/ε) iterations.

What carries the argument

The OBCD row-wise update that globally solves a small nonsmooth optimization problem under orthogonality constraints for any chosen block of k rows.

If this is right

  • Row-wise orthogonal updates can reach any feasible point from any feasible initialization.
  • Limit points of OBCD are global block-k stationary points that satisfy stronger optimality than standard critical points.
  • An ε-block-k stationary point is found after O(1/ε) iterations.
  • Under the Kurdyka–Lojasiewicz inequality a non-ergodic convergence rate holds.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If similar globally solvable subproblems exist for other matrix constraints, the same block-update completeness argument could apply.
  • The stronger stationarity notion may translate into better empirical performance on downstream statistical-learning tasks that rely on orthogonal matrices.
  • Breakpoint search techniques developed for the subproblems could be reused in other nonconvex composite problems that admit low-dimensional orthogonal blocks.

Load-bearing premise

The small nonsmooth optimization subproblems under orthogonality constraints that arise when updating k rows can be solved to global optimality using breakpoint search methods.

What would settle it

A concrete numerical instance in which breakpoint search returns a suboptimal subproblem solution and the overall algorithm fails to produce a global block-k stationary point, or an experiment showing that more than linear-in-1/ε iterations are required to reach an ε-block-k stationary point.

Figures

Figures reproduced from arXiv: 2304.03641 by Ganzhao Yuan.

Figure 1
Figure 1. Figure 1: The convergence curve of the compared methods for s [PITH_FULL_IMAGE:figures/full_fig_p010_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Geometric Visualizations of Two Examples of [PITH_FULL_IMAGE:figures/full_fig_p022_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: The convergence curve of the compared methods for s [PITH_FULL_IMAGE:figures/full_fig_p042_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: The convergence curve of the compared methods for s [PITH_FULL_IMAGE:figures/full_fig_p047_4.png] view at source ↗
read the original abstract

Nonsmooth composite optimization with orthogonality constraints has a wide range of applications in statistical learning and data science. However, this problem is challenging due to its nonsmooth objective and computationally expensive nonconvex constraints. In this paper, we propose a new approach called \textbf{OBCD}, which leverages block coordinate descent to address these challenges. \textbf{OBCD} is a feasible method with a small computational footprint. In each iteration, it updates \(k\) rows of the solution matrix, where \(k \geq 2\), by globally solving a small nonsmooth optimization problem under orthogonality constraints. We prove the completeness of the proposed update scheme, showing that row-wise orthogonal updates can reach any feasible point from any feasible initialization. We further prove that the limit points generated by \textbf{OBCD}, referred to as global block-\(k\) stationary points, offer stronger optimality than standard critical points. Furthermore, we show that \textbf{OBCD} finds an \(\epsilon\)-block-\(k\) stationary point with an iteration complexity of \(\mathcal{O}(1/\epsilon)\). Additionally, under the Kurdyka--Lojasiewicz (KL) inequality, we establish the non-ergodic convergence rate of \textbf{OBCD}. We also demonstrate how novel breakpoint search methods can be used to solve the subproblems arising in \textbf{OBCD}. Empirical results show that our approach consistently outperforms existing methods.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The paper proposes OBCD, a block coordinate descent algorithm for nonsmooth composite optimization subject to orthogonality constraints. In each iteration, k rows (k≥2) of the matrix variable are updated by globally solving a small nonsmooth subproblem under row-wise orthogonality constraints, using novel breakpoint search methods. The central claims are: (i) the row-wise orthogonal update scheme is complete, i.e., any feasible point is reachable from any feasible initialization; (ii) limit points are global block-k stationary points, a stronger notion than standard critical points; (iii) OBCD reaches an ε-block-k stationary point in O(1/ε) iterations; (iv) non-ergodic convergence rates hold under the KL inequality; and (v) the method empirically outperforms existing approaches on statistical learning tasks.

Significance. If the central claims hold, the work supplies a feasible, low-footprint method with explicit iteration complexity and a strictly stronger stationarity concept for a practically relevant class of orthogonality-constrained nonsmooth problems. The completeness result for the update scheme and the O(1/ε) bound to ε-block-k stationarity are notable strengths that would distinguish the contribution from standard BCD analyses.

major comments (2)
  1. [subproblem solver description (near the statement of the breakpoint search)] The section describing the breakpoint search methods for the k-row subproblems: the manuscript asserts that these methods solve each nonsmooth composite subproblem to global optimality, yet supplies no explicit argument that the enumerated breakpoints cover every candidate point or that the nonsmooth terms (ℓ1, indicators, etc.) cannot introduce additional stationary points missed by the search. Because global optimality of every block update is required for the descent property, the completeness theorem, the definition of global block-k stationarity, and the O(1/ε) complexity bound all rest on this unverified claim.
  2. [complexity theorem] Theorem establishing the O(1/ε) iteration complexity: the proof invokes that each subproblem is solved to global optimality to obtain a uniform descent amount; if the breakpoint procedure returns only a local solution on some instances, the descent lemma fails and the complexity guarantee does not hold.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive report. The two major comments correctly identify that the current manuscript asserts global optimality of the breakpoint search procedures for the k-row subproblems but does not supply an explicit, self-contained argument that the enumerated breakpoints exhaust all candidate minimizers and that the nonsmooth terms cannot create additional stationary points outside the search. We will address this gap by adding the required proof in the revised version; the remainder of the analysis then follows as stated.

read point-by-point responses
  1. Referee: [subproblem solver description (near the statement of the breakpoint search)] The section describing the breakpoint search methods for the k-row subproblems: the manuscript asserts that these methods solve each nonsmooth composite subproblem to global optimality, yet supplies no explicit argument that the enumerated breakpoints cover every candidate point or that the nonsmooth terms (ℓ1, indicators, etc.) cannot introduce additional stationary points missed by the search. Because global optimality of every block update is required for the descent property, the completeness theorem, the definition of global block-k stationarity, and the O(1/ε) complexity bound all rest on this unverified claim.

    Authors: We agree that an explicit verification is required. In the revised manuscript we will insert a dedicated subsection (immediately following the description of the breakpoint search) that proves: (i) every point at which the subdifferential of the nonsmooth composite objective can contain zero under the row-wise orthogonality constraint must coincide with one of the enumerated breakpoints or an endpoint of the feasible interval; (ii) the nonsmooth terms (ℓ1 norms and indicator functions) contribute only finitely many additional candidate points that are already captured by the breakpoint enumeration; and (iii) exhaustive evaluation at these points therefore yields the global minimizer. This argument will be written in a self-contained manner so that the descent property, completeness of the update scheme, global block-k stationarity, and complexity bound rest on a fully justified foundation. revision: yes

  2. Referee: [complexity theorem] Theorem establishing the O(1/ε) iteration complexity: the proof invokes that each subproblem is solved to global optimality to obtain a uniform descent amount; if the breakpoint procedure returns only a local solution on some instances, the descent lemma fails and the complexity guarantee does not hold.

    Authors: The complexity proof indeed relies on a uniform descent amount that is guaranteed only when each block subproblem is solved to global optimality. Once the explicit argument requested in the first comment is added, the descent lemma holds without qualification and the O(1/ε) bound follows exactly as written. No change to the statement or proof structure of the complexity theorem itself will be needed beyond referencing the new subsection on global optimality of the subproblem solver. revision: yes

Circularity Check

0 steps flagged

No significant circularity; theoretical results are independently derived

full rationale

The paper presents explicit proofs for the completeness of the row-wise orthogonal update scheme (reaching any feasible point), the stronger optimality of global block-k stationary points, and the O(1/ε) iteration complexity to ε-block-k stationarity. These are standard convergence arguments in block coordinate descent literature and do not reduce to fitted parameters, self-definitions, or self-citation chains. The breakpoint search method for subproblems is introduced as a separate algorithmic contribution rather than an unverified assumption that collapses the main claims. No load-bearing step matches any enumerated circularity pattern.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claims rest on the global solvability of the k-row subproblems and the applicability of the KL inequality for rates; no free parameters or invented entities are indicated in the abstract.

axioms (1)
  • domain assumption Kurdyka--Lojasiewicz (KL) inequality
    Invoked to establish the non-ergodic convergence rate of OBCD.

pith-pipeline@v0.9.0 · 5788 in / 1364 out tokens · 42791 ms · 2026-05-24T09:20:26.198368+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

58 extracted references · 58 canonical work pages

  1. [1]

    Advances in Neural Information Processing Systems (NeurIPS), 26, 2013

    Fantope projection and selection: A near-optimal convex relaxation of sparse pca. Advances in Neural Information Processing Systems (NeurIPS), 26, 2013

  2. [2]

    E., Eriksson, J., and Koivunen, V

    Abrudan, T. E., Eriksson, J., and Koivunen, V. Steepest descent algorithms for optimization under unitary matrix constraint. IEEE Transactions on Signal Processing, 56 0 (3): 0 1134--1147, 2008

  3. [3]

    E., and Sepulchre, R

    Absil, P., Mahony, R. E., and Sepulchre, R. Optimization Algorithms on Matrix Manifolds. Princeton University Press, 2008

  4. [4]

    Proximal alternating minimization and projection methods for nonconvex problems: An approach based on the kurdyka-lojasiewicz inequality

    Attouch, H., Bolte, J., Redont, P., and Soubeyran, A. Proximal alternating minimization and projection methods for nonconvex problems: An approach based on the kurdyka-lojasiewicz inequality. Mathematics of Operations Research, 35 0 (2): 0 438--457, 2010

  5. [5]

    Deep layers as stochastic solvers

    Bibi, A., Ghanem, B., Koltun, V., and Ranftl, R. Deep layers as stochastic solvers. In International Conference on Learning Representations (ICLR). OpenReview.net, 2019

  6. [6]

    Proximal alternating linearized minimization for nonconvex and nonsmooth problems

    Bolte, J., Sabach, S., and Teboulle, M. Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Mathematical Programming, 146 0 (1-2): 0 459--494, 2014

  7. [7]

    and Absil, P.-a

    Boumal, N. and Absil, P.-a. Rtrmc: A riemannian trust-region method for low-rank matrix completion. Advances in Neural Information Processing Systems (NeurIPS), 24, 2011

  8. [8]

    Proximal gradient method for nonsmooth optimization over the stiefel manifold

    Chen, S., Ma, S., Man-Cho So, A., and Zhang, T. Proximal gradient method for nonsmooth optimization over the stiefel manifold. SIAM Journal on Optimization, 30 0 (1): 0 210--239, 2020

  9. [9]

    An augmented lagrangian method for _1 -regularized optimization problems with orthogonality constraints

    Chen, W., Ji, H., and You, Y. An augmented lagrangian method for _1 -regularized optimization problems with orthogonality constraints. SIAM Journal on Scientific Computing, 38 0 (4): 0 B570--B592, 2016

  10. [10]

    Optimal solutions for sparse principal component analysis

    d'Aspremont, A., Bach, F., and El Ghaoui, L. Optimal solutions for sparse principal component analysis. Journal of Machine Learning Research, 9 0 (7), 2008

  11. [11]

    A., and Smith, S

    Edelman, A., Arias, T. A., and Smith, S. T. The geometry of algorithms with orthogonality constraints. SIAM Journal on Matrix Analysis and Applications, 20 0 (2): 0 303--353, 1998

  12. [12]

    and Bruna, J

    Frerix, T. and Bruna, J. Approximating orthogonal matrices with effective givens factorization. In International Conference on Machine Learning (ICML), pp.\ 1993--2001, 2019

  13. [13]

    A new first-order algorithmic framework for optimization problems with orthogonality constraints

    Gao, B., Liu, X., Chen, X., and Yuan, Y.-x. A new first-order algorithmic framework for optimization problems with orthogonality constraints. SIAM Journal on Optimization, 28 0 (1): 0 302--332, 2018

  14. [14]

    Parallelizable algorithms for optimization problems with orthogonality constraints

    Gao, B., Liu, X., and Yuan, Y.-x. Parallelizable algorithms for optimization problems with orthogonality constraints. SIAM Journal on Scientific Computing, 41 0 (3): 0 A1949--A1983, 2019

  15. [15]

    Golub, G. H. and Van Loan, C. F. Matrix computations. 2013

  16. [16]

    and Yuan, X

    He, B. and Yuan, X. On the O (1/n) convergence rate of the douglascrachford alternating direction method. SIAM Journal on Numerical Analysis, 50 0 (2): 0 700--709, 2012

  17. [17]

    Adaptive quadratically regularized newton method for riemannian optimization

    Hu, J., Milzarek, A., Wen, Z., and Yuan, Y. Adaptive quadratically regularized newton method for riemannian optimization. SIAM Journal on Matrix Analysis and Applications, 39 0 (3): 0 1181--1207, 2018

  18. [18]

    Structured quasi-newton methods for optimization with orthogonality constraints

    Hu, J., Jiang, B., Lin, L., Wen, Z., and Yuan, Y.-x. Structured quasi-newton methods for optimization with orthogonality constraints. SIAM Journal on Scientific Computing, 41 0 (4): 0 A2239--A2269, 2019

  19. [19]

    A brief introduction to manifold optimization

    Hu, J., Liu, X., Wen, Z.-W., and Yuan, Y.-X. A brief introduction to manifold optimization. Journal of the Operations Research Society of China, 8 0 (2): 0 199--248, 2020

  20. [20]

    J., Collins, M

    Hwang, S. J., Collins, M. D., Ravi, S. N., Ithapu, V. K., Adluru, N., Johnson, S. C., and Singh, V. A projection free method for generalized eigenvalue problem with a nonsmooth regularizer. In IEEE International Conference on Computer Vision (ICCV), pp.\ 1841--1849, 2015

  21. [21]

    and Dai, Y.-H

    Jiang, B. and Dai, Y.-H. A framework of constraint preserving update schemes for optimization on stiefel manifold. Mathematical Programming, 153 0 (2): 0 535--575, 2015

  22. [22]

    _p -norm regularization algorithms for optimization over permutation matrices

    Jiang, B., Liu, Y.-F., and Wen, Z. _p -norm regularization algorithms for optimization over permutation matrices. SIAM Journal on Optimization, 26 0 (4): 0 2284--2313, 2016

  23. [23]

    An exact penalty approach for optimization with nonnegative orthogonality constraints

    Jiang, B., Meng, X., Wen, Z., and Chen, X. An exact penalty approach for optimization with nonnegative orthogonality constraints. Mathematical Programming, pp.\ 1--43, 2022

  24. [24]

    Generalized power method for sparse principal component analysis

    Journ \'e e, M., Nesterov, Y., Richt \'a rik, P., and Sepulchre, R. Generalized power method for sparse principal component analysis. Journal of Machine Learning Research, 11 0 (2), 2010

  25. [25]

    and Osher, S

    Lai, R. and Osher, S. A splitting method for orthogonality constrained problems. Journal of Scientific Computing, 58 0 (2): 0 431--449, 2014

  26. [26]

    Weakly convex optimization over stiefel manifold using riemannian subgradient-type methods

    Li, X., Chen, S., Deng, Z., Qu, Q., Zhu, Z., and Man-Cho So, A. Weakly convex optimization over stiefel manifold using riemannian subgradient-type methods. SIAM Journal on Optimization, 31 0 (3): 0 1605--1634, 2021

  27. [27]

    Liu, H., Wu, W., and So, A. M.-C. Quadratic optimization with orthogonality constraints: Explicit lojasiewicz exponent and linear convergence of line-search methods. In International Conference on Machine Learning (ICML), pp.\ 1158--1167, 2016

  28. [28]

    On the estimation performance and convergence rate of the generalized power method for phase synchronization

    Liu, H., Yue, M.-C., and Man-Cho So, A. On the estimation performance and convergence rate of the generalized power method for phase synchronization. SIAM Journal on Optimization, 27 0 (4): 0 2426--2446, 2017

  29. [29]

    A class of smooth exact penalty function methods for optimization problems with orthogonality constraints

    Liu, W., Zhang, Y., Yang, H., and Zhang, S. A class of smooth exact penalty function methods for optimization problems with orthogonality constraints. Optimization, 69 0 (3): 0 399--426, 2020

  30. [30]

    On the convergence of the self-consistent field iteration in kohn--sham density functional theory

    Liu, X., Wang, X., Wen, Z., and Yuan, Y. On the convergence of the self-consistent field iteration in kohn--sham density functional theory. SIAM Journal on Matrix Analysis and Applications, 35 0 (2): 0 546--558, 2014

  31. [31]

    An efficient gauss--newton algorithm for symmetric low-rank product matrix approximations

    Liu, X., Wen, Z., and Zhang, Y. An efficient gauss--newton algorithm for symmetric low-rank product matrix approximations. SIAM Journal on Optimization, 25 0 (3): 0 1571--1608, 2015

  32. [32]

    and Zhang, Y

    Lu, Z. and Zhang, Y. An augmented lagrangian approach for sparse principal component analysis. Mathematical Programming, 135 0 (1-2): 0 149--193, 2012

  33. [33]

    Optimization with first-order surrogate functions

    Mairal, J. Optimization with first-order surrogate functions. In International Conference on Machine Learning (ICML), volume 28, pp.\ 783--791, 2013

  34. [34]

    and Abrol, V

    Massart, E. and Abrol, V. Coordinate descent on the orthogonal group for recurrent neural network training. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), volume 36, pp.\ 7744--7751, 2022

  35. [35]

    Mordukhovich, B. S. Variational analysis and generalized differentiation i: Basic theory. Berlin Springer, 330, 2006

  36. [36]

    Coordinate descent method for k-means

    Nie, F., Xue, J., Wu, D., Wang, R., Li, H., and Li, X. Coordinate descent method for k-means. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44 0 (5): 0 2371--2385, 2022

  37. [37]

    Exact penalty methods for minimizing a smooth function over the nonnegative orthogonal set

    Qian, Y., Pan, S., and Xiao, L. Exact penalty methods for minimizing a smooth function over the nonnegative orthogonal set. arXiv, 11 2021

  38. [38]

    A unified convergence analysis of block successive minimization methods for nonsmooth optimization

    Razaviyayn, M., Hong, M., and Luo, Z. A unified convergence analysis of block successive minimization methods for nonsmooth optimization. SIAM Journal on Optimization, 23 0 (2): 0 1126--1153, 2013

  39. [39]

    Rockafellar, R. T. and Wets., R. J.-B. Variational analysis. Springer Science & Business Media, 317, 2009

  40. [40]

    and Chechik, G

    Shalit, U. and Chechik, G. Coordinate-descent for learning orthogonal matrices through givens rotations. In International Conference on Machine Learning (ICML), pp.\ 548--556. PMLR, 2014

  41. [41]

    and Bischof, C

    Sun, X. and Bischof, C. A basis-kernel representation of orthogonal matrices. SIAM Journal on Matrix Analysis and Applications, 16 0 (4): 0 1184--1196, 1995

  42. [42]

    and Yun, S

    Tseng, P. and Yun, S. A coordinate gradient descent method for nonsmooth separable minimization. Mathematical Programming, 117 0 (1-2): 0 387--423, 2009

  43. [43]

    Q., Cho, J., Lei, J., and Rohe, K

    Vu, V. Q., Cho, J., Lei, J., and Rohe, K. Fantope projection and selection: A near-optimal convex relaxation of sparse pca. Advances in Neural Information Processing Systems (NeurIPS), 26, 2013

  44. [44]

    and Yin, W

    Wen, Z. and Yin, W. A feasible method for optimization with orthogonality constraints. Mathematical Programming, 142 0 (1-2): 0 397, 2013

  45. [45]

    Trace-penalty minimization for large-scale eigenspace computation

    Wen, Z., Yang, C., Liu, X., and Zhang, Y. Trace-penalty minimization for large-scale eigenspace computation. Journal of Scientific Computing, 66 0 (3): 0 1175--1203, 2016

  46. [46]

    Quartic equation: https://en.wikipedia.org/wiki/Quartic_equation

    WikiContributors. Quartic equation: https://en.wikipedia.org/wiki/Quartic_equation. pp.\ Last edited in March, 2023

  47. [47]

    and Yin, W

    Xu, Y. and Yin, W. A block coordinate descent method for regularized multiconvex optimization with applications to nonnegative tensor factorization and completion. SIAM Journal on Imaging Sciences, 6 0 (3): 0 1758--1789, 2013

  48. [48]

    Coordinate descent methods for fractional minimization

    Yuan, G. Coordinate descent methods for fractional minimization. arXiv, 2022

  49. [49]

    Coordinate descent methods for dc minimization: Optimality conditions and global convergence

    Yuan, G. Coordinate descent methods for dc minimization: Optimality conditions and global convergence. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2023

  50. [50]

    A block decomposition algorithm for sparse optimization

    Yuan, G., Shen, L., and Zheng, W.-S. A block decomposition algorithm for sparse optimization. In ACM SIGKDD Conference on Knowledge Discovery and Data Mining (SIGKDD), 2020

  51. [51]

    and Shashua, A

    Zass, R. and Shashua, A. Nonnegative sparse PCA . Advances in Neural Information Processing Systems (NeurIPS), pp.\ 1561--1568, 2006 a

  52. [52]

    and Shashua, A

    Zass, R. and Shashua, A. Nonnegative sparse pca. Advances in Neural Information Processing Systems (NeurIPS), 19, 2006 b

  53. [53]

    T.-K., Lin, S., and Yao, Y

    Zeng, J., Lau, T. T.-K., Lin, S., and Yao, Y. Global convergence of block coordinate descent in deep learning. In International Conference on Machine Learning (ICML), pp.\ 7313--7323. PMLR, 2019

  54. [54]

    Complete dictionary learning via l4-norm maximization over the orthogonal group

    Zhai, Y., Yang, Z., Liao, Z., Wright, J., and Ma, Y. Complete dictionary learning via l4-norm maximization over the orthogonal group. Journal of Machine Learning Research, 21 0 (165): 0 1--68, 2020

  55. [55]

    Primal-dual optimization algorithms over riemannian manifolds: an iteration complexity analysis

    Zhang, J., Ma, S., and Zhang, S. Primal-dual optimization algorithms over riemannian manifolds: an iteration complexity analysis. Mathematical Programming, pp.\ 1--46, 2019

  56. [56]

    Gradient type optimization methods for electronic structure calculations

    Zhang, X., Zhu, J., Wen, Z., and Zhou, A. Gradient type optimization methods for electronic structure calculations. SIAM Journal on Scientific Computing, 36 0 (3): 0 265--289, 2014

  57. [57]

    A riemannian newton algorithm for nonlinear eigenvalue problems

    Zhao, Z., Bai, Z.-J., and Jin, X.-Q. A riemannian newton algorithm for nonlinear eigenvalue problems. SIAM Journal on Matrix Analysis and Applications, 36 0 (2): 0 752--774, 2015

  58. [58]

    write newline

    " write newline "" before.all 'output.state := FUNCTION n.dashify 't := "" t empty not t #1 #1 substring "-" = t #1 #2 substring "--" = not "--" * t #2 global.max substring 't := t #1 #1 substring "-" = "-" * t #2 global.max substring 't := while if t #1 #1 substring * t #2 global.max substring 't := if while FUNCTION format.date year duplicate empty "emp...