On solving symmetric multi-type orthogonal non-negative matrix tri-factorization problem

Andrej Kastrin; Gregor Papa; Janez Povh; Rok Hribar

arxiv: 2606.08291 · v1 · pith:BOW2UWSEnew · submitted 2026-06-06 · 💻 cs.LG

On solving symmetric multi-type orthogonal non-negative matrix tri-factorization problem

Rok Hribar , Gregor Papa , Janez Povh , Andrej Kastrin This is my paper

Pith reviewed 2026-06-27 19:48 UTC · model grok-4.3

classification 💻 cs.LG

keywords non-negative matrix tri-factorizationorthogonal constraintssymmetric matricesnetwork embeddingclusteringlink predictionheuristic optimization

0 comments

The pith

Two heuristic algorithms find high-quality solutions for the symmetric multi-type orthogonal non-negative matrix tri-factorization problem.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces two methods to solve a non-convex optimization problem that approximates several symmetric non-negative matrices simultaneously using a shared non-negative orthogonal factor G multiplied by type-specific factors. This shared structure is intended to produce consistent latent representations useful for clustering and analyzing networks. The first method uses a fixed-point iteration based on KKT conditions with a penalty for orthogonality, and the second uses a staged ADAM optimizer that enforces non-negativity and orthogonality in phases. Tests on synthetic data with noise and on real citation networks show the methods recover factors near the optimum and yield embeddings that perform as well as or better than SVD, node2vec, and other heuristics in link prediction, node clustering, and classification.

Core claim

The two proposed algorithms, a fixed-point method derived from penalized KKT conditions and a three-stage ADAM method, reliably compute high-quality local solutions to the symmetric multi-type orthogonal non-negative matrix tri-factorization problem, recovering factorizations close to the optimum on synthetic data and producing competitive embeddings on real networks.

What carries the argument

The shared non-negative orthogonal matrix G that defines the common structure in the approximations G S_i G^T for each input matrix.

If this is right

The learned G provides an assignment-type structure suitable for node clustering in networks.
The embeddings perform competitively in link prediction and node classification tasks on citation networks.
Both methods remain stable when input matrices contain noise.
The factorization improves interpretability over standard methods due to non-negativity and orthogonality.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

This joint factorization could be adapted for other multi-view network data beyond citation networks.
The orthogonality constraint might be relaxed in future work to handle overlapping communities.
Since the methods are heuristics, combining them with global optimization techniques could further improve solution quality.

Load-bearing premise

The non-convex problem has high-quality local solutions that the fixed-point and ADAM heuristics can consistently locate.

What would settle it

On synthetic instances where the true generating G and S_i are known, the algorithms fail to recover factors with small reconstruction error or large deviation from the ground truth.

read the original abstract

We study the symmetric multi-type orthogonal non-negative matrix tri-factorization problem, where several symmetric non-negative matrices are simultaneously approximated by factors of the form $GS_{i}G^{\top}$, with a shared non-negative and orthogonal factor $G$. This model is motivated by clustering and network analysis, where non-negativity improves interpretability and orthogonality gives a natural assignment-type structure to the latent factor. Since the resulting optimization problem is highly non-convex, we develop two heuristic algorithms for computing high-quality local solutions. The first one is a fixed point method derived from the Karush-Kuhn-Tucker conditions after adding a penalty term for the orthogonality constraint. The second one is a three-stage ADAM-based method that combines non-negativity-preserving optimization, orthogonalization, and restricted ADAM refinement on the feasible set. We evaluate both methods on synthetic data, including noisy instances, and on citation network benchmarks. The synthetic experiments show that both algorithms recover factorizations close to the optimum and remain stable under noise. On real networks, the learned embeddings are competitive with or better than standard baselines such as SVD, node2vec, and classical link prediction heuristics in link prediction, node clustering, and node classification tasks.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Two new heuristics for this exact symmetric multi-type orthogonal NMF tri-factorization, but the experiments are described at too high a level to judge their strength.

read the letter

The main takeaway is that the paper develops two specific heuristics for the symmetric multi-type orthogonal non-negative matrix tri-factorization problem: a penalized fixed-point iteration from the KKT conditions and a three-stage ADAM procedure that handles non-negativity, orthogonalization, and feasible-set refinement in sequence. These are the concrete new pieces, aimed at cases where multiple symmetric non-negative matrices share an orthogonal non-negative factor G.

The work does a reasonable job motivating the setup for network clustering and embedding, where orthogonality produces assignment-like structure and non-negativity helps interpretability. The synthetic tests directly check recovery of known ground-truth factors, including under added noise, which is a sensible way to probe whether the heuristics locate decent local solutions rather than poor minima. The real-data section compares the resulting embeddings against SVD, node2vec, and basic link-prediction rules on citation networks for link prediction, clustering, and classification.

The soft spot is the evaluation. Claims of recovering factorizations close to optimum and remaining stable are stated without numbers, error bars, baseline tables, or ablation results on the penalty coefficient or stage thresholds. This makes it difficult to gauge how large any gains are or how sensitive the methods are to the free parameters. The non-convexity is handled by heuristics whose behavior is checked empirically, which is fair, but the lack of quantitative detail leaves the practical reliability unclear.

This is for people already working on matrix factorization variants for graphs or multi-view networks. A reader in that narrow area could get value from the algorithmic descriptions and might want to implement the methods for their own data.

The paper shows clear engagement with the optimization and the relevant literature, so it is worth a serious referee even if the experiments need more detail in revision. I would send it to peer review rather than desk reject.

Referee Report

2 major / 2 minor

Summary. The paper studies the symmetric multi-type orthogonal non-negative matrix tri-factorization problem, in which multiple symmetric non-negative matrices are approximated simultaneously as GS_i G^T using a shared non-negative orthogonal factor G. Motivated by clustering and network analysis, it develops two heuristics for this non-convex problem: a fixed-point iteration derived from penalized KKT conditions, and a three-stage ADAM optimizer that enforces non-negativity then orthogonality then refines on the feasible set. Synthetic experiments (including noisy cases) are claimed to show recovery close to the optimum and stability; real-data experiments on citation networks claim competitive or superior performance versus SVD, node2vec, and classical heuristics on link prediction, node clustering, and node classification.

Significance. If the empirical claims hold, the work supplies practical, interpretable methods for a structured NMF variant useful in network embedding. A strength is the direct use of synthetic ground-truth recovery experiments to test whether the heuristics locate high-quality local minima rather than poor ones, together with multi-task real-data benchmarks against established baselines. This addresses the acknowledged non-convexity via concrete, evaluable procedures rather than theoretical guarantees alone.

major comments (2)

[§4] §4 (synthetic experiments): the central claim that both algorithms 'recover factorizations close to the optimum' is load-bearing for the motivation of the heuristics, yet the manuscript reports no quantitative recovery metrics (e.g., relative Frobenius error to ground-truth factors, success rates across random initializations) or error bars. Without these numbers it is impossible to judge how close 'close' is or whether the methods reliably avoid poor minima.
[§4] §4 (real-data experiments): the claim of competitiveness or superiority on link prediction, clustering, and classification rests on comparisons to SVD, node2vec, and heuristics, but the text provides no details on hyper-parameter tuning protocols, number of runs, or statistical significance tests. This weakens the downstream-performance part of the central empirical argument.

minor comments (2)

[§3] The orthogonality penalty coefficient and ADAM stage-transition thresholds are free parameters; their selection procedure should be stated explicitly in §3.
[§2] Notation for the multi-type matrices S_i and the shared G should be introduced once with a clear table or diagram rather than piecemeal.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which highlight opportunities to strengthen the empirical sections. We address each major point below and will revise the manuscript to incorporate the requested details and metrics.

read point-by-point responses

Referee: §4 (synthetic experiments): the central claim that both algorithms 'recover factorizations close to the optimum' is load-bearing for the motivation of the heuristics, yet the manuscript reports no quantitative recovery metrics (e.g., relative Frobenius error to ground-truth factors, success rates across random initializations) or error bars. Without these numbers it is impossible to judge how close 'close' is or whether the methods reliably avoid poor minima.

Authors: We agree that quantitative metrics are necessary to make the recovery claims precise and verifiable. In the revision we will add tables and plots reporting relative Frobenius error to the ground-truth factors, success rates (fraction of runs recovering within a tolerance) across random initializations, and error bars (standard deviation over runs) for both the noiseless and noisy synthetic regimes. revision: yes
Referee: §4 (real-data experiments): the claim of competitiveness or superiority on link prediction, clustering, and classification rests on comparisons to SVD, node2vec, and heuristics, but the text provides no details on hyper-parameter tuning protocols, number of runs, or statistical significance tests. This weakens the downstream-performance part of the central empirical argument.

Authors: We acknowledge that the current manuscript lacks sufficient experimental-protocol transparency. The revised version will specify the hyper-parameter search ranges and selection criteria, state the number of independent runs performed for each method and dataset, and report statistical significance (e.g., paired t-tests or Wilcoxon signed-rank tests with p-values) for the performance differences on link prediction, clustering, and classification tasks. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper's core methods are derived from standard KKT conditions (with an added penalty) and off-the-shelf ADAM optimization applied in stages; neither reduces to a fitted parameter or self-citation that defines the claimed performance. Synthetic recovery experiments directly measure distance to known ground-truth factors, and real-data results compare against external baselines (SVD, node2vec, heuristics) on independent tasks. No equation or premise is equivalent to its inputs by construction, and the non-convexity handling is evaluated empirically rather than assumed via internal redefinition.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The work rests on standard non-convex optimization assumptions and the existence of usable local minima; no new physical or mathematical entities are postulated.

free parameters (2)

orthogonality penalty coefficient
Introduced to convert the hard orthogonality constraint into a penalized objective for the fixed-point derivation.
ADAM learning rates and stage transition thresholds
Chosen to control the three-stage procedure; values not specified in abstract.

axioms (2)

standard math KKT optimality conditions apply to the penalized problem
Invoked to derive the fixed-point iteration.
domain assumption Non-negativity and orthogonality can be approximately enforced via staged projection and refinement
Underlies the three-stage ADAM method.

pith-pipeline@v0.9.1-grok · 5751 in / 1270 out tokens · 22747 ms · 2026-06-27T19:48:54.644332+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

42 extracted references · 30 canonical work pages

[1]

J Optimiz Theory App190, 234–258 (2021) https://doi.org/10.1007/s10957-021-01880-5

Ahookhosh, M., Hien, L.T.K., Gillis, N., Patrinos, P.: A block inertial Bregman proximal algorithm for nonsmooth nonconvex problems with application to symmetric nonnegative matrix tri-factorization. J Optimiz Theory App190, 234–258 (2021) https://doi.org/10.1007/s10957-021-01880-5

work page doi:10.1007/s10957-021-01880-5 2021
[2]

Mathematics9(5), 540 (2021)

Asadi, S., Povh, J.: A block coordinate descent-based projected gradient algorithm for orthogonal non- negative matrix factorization. Mathematics9(5), 540 (2021)

2021
[3]

Adv Data Anal Classi13, 825–853 (2019) https://doi.org/10.1007/s11634-018-0348-8

Abe, H., Yadohisa, H.: Orthogonal nonnegative matrix tri-factorization based on Tweedie distributions. Adv Data Anal Classi13, 825–853 (2019) https://doi.org/10.1007/s11634-018-0348-8

work page doi:10.1007/s11634-018-0348-8 2019
[4]

Athena Scientific, Natick, MA (2016) 24

Bertsekas, D.P.: Nonlinear Programming. Athena Scientific, Natick, MA (2016) 24

2016
[5]

IEEE T Knowl Data En35(5), 5132–5146 (2022) https://doi.org/10.1109/ TKDE.2022.3145489

Chen, Y., Lei, Z., Rao, Y., Xie, H., Wang, F.L., Yin, J., Li, Q.: Parallel non-negative matrix tri-factorization for text data co-clustering. IEEE T Knowl Data En35(5), 5132–5146 (2022) https://doi.org/10.1109/ TKDE.2022.3145489

arXiv 2022
[6]

John Wiley & Sons, Hoboken, NJ (2009) ˇCopar, A., ˇZitnik, M., Zupan, B.: Scalable non-negative matrix tri-factorization

Cichocki, A., Zdunek, R., Phan, A.H., Amari, S.-i.: Nonnegative Matrix and Tensor Factorizations: Applica- tions to Exploratory Multi-way Data Analysis and Blind Source Separation. John Wiley & Sons, Hoboken, NJ (2009) ˇCopar, A., ˇZitnik, M., Zupan, B.: Scalable non-negative matrix tri-factorization. Biodata Min10, 41 (2017) https://doi.org/10.1186/s1304...

work page doi:10.1186/s13040-017-0160-6 2009
[7]

Found Comput Math20, 119–154 (2020) https://doi.org/10.1007/s10208-018-09409-5

Davis, D., Drusvyatskiy, D., Kakade, S., Lee, J.D.: Stochastic subgradient method converges on tame functions. Found Comput Math20, 119–154 (2020) https://doi.org/10.1007/s10208-018-09409-5

work page doi:10.1007/s10208-018-09409-5 2020
[8]

In: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp

Ding, C., Li, T., Peng, W., Park, H.: Orthogonal nonnegative matrix t-factorizations for clustering. In: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 126–135. Association for Computing Machinery, Philadelphia, PA (2006). https://doi.org/10. 1145/1150402.1150420

arXiv 2006
[9]

Algorithms12(10) (2019) https://doi.org/10.3390/a12100216

Favati, P., Lotti, G., Menchi, O., Romani, F.: Adaptive clustering via symmetric nonnegative matrix factorization of the similarity matrix. Algorithms12(10) (2019) https://doi.org/10.3390/a12100216

work page doi:10.3390/a12100216 2019
[10]

Society for Industrial and Applied Mathematics, Philadelphia, PA (2020)

Gillis, N.: Nonnegative Matrix Factorization. Society for Industrial and Applied Mathematics, Philadelphia, PA (2020)

2020
[11]

KDD - node2vec: Scalable Feature Learning for Networks,

Grover, A., Leskovec, J.: node2vec: Scalable feature learning for networks. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 855–864. Association for Computing Machinery, San Francisco, CA (2016). https://doi.org/10.1145/2939672.2939754 Gligorijevi´ c, V., Malod-Dognin, N., Prˇ zulj, N.: Fuse: Mul...

work page doi:10.1145/2939672.2939754 2016
[12]

In: Proceedings of the 2016 IEEE International Con- ference on Acoustics, Speech and Signal Processing, pp

Hajinezhad, D., Chang, T.-H., Wang, X., Shi, Q., Hong, M.: Nonnegative matrix factorization using ADMM: Algorithm and convergence analysis. In: Proceedings of the 2016 IEEE International Con- ference on Acoustics, Speech and Signal Processing, pp. 4742–4746. IEEE, Shanghai, China (2016). https://doi.org/10.1109/ICASSP.2016.7472577

work page doi:10.1109/icassp.2016.7472577 2016
[13]

J Global Optim82(2), 283–312 (2022) https://doi.org/10.1007/s10898-021-01074-3

Hribar, R., Hrga, T., Papa, G., Petelin, G., Povh, J., Prˇ zulj, N., Vukaˇ sinovi´ c, V.: Four algorithms to solve symmetric multi-type non-negative matrix tri-factorization problem. J Global Optim82(2), 283–312 (2022) https://doi.org/10.1007/s10898-021-01074-3

work page doi:10.1007/s10898-021-01074-3 2022
[14]

SIAM Rev53(2), 217–288 (2011) https://doi.org/10

Halko, N., Martinsson, P.-G., Tropp, J.A.: Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions. SIAM Rev53(2), 217–288 (2011) https://doi.org/10. 1137/090771806

2011
[15]

https://www.cs.toronto.edu/∼tijmen/csc321/slides/lecture slides lec6.pdf (2012)

Hinton, G., Srivastava, N., Swersky, K.: Neural networks for machine learning (Lecture 6a): Overview of mini- batch gradient descent. https://www.cs.toronto.edu/∼tijmen/csc321/slides/lecture slides lec6.pdf (2012)

2012
[16]

Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization (2014) arXiv:1412.6980 [cs.LG]

Pith/arXiv arXiv 2014
[17]

SIAM J Matrix Anal A30(2), 713–730 (2008) https://doi.org/10.1137/ 07069239X 25

Kim, H., Park, H.: Nonnegative matrix factorization based on alternating nonnegativity constrained least squares and active set method. SIAM J Matrix Anal A30(2), 713–730 (2008) https://doi.org/10.1137/ 07069239X 25

2008
[18]

Method Inform Med55(4), 340–346 (2016) https://doi.org/10.3414/ ME15-01-0108

Kastrin, A., Rindflesch, T.C., Hristovski, D.: Link prediction on a network of co-occurring MeSH terms: Towards literature-based discovery. Method Inform Med55(4), 340–346 (2016) https://doi.org/10.3414/ ME15-01-0108

2016
[19]

Physica A553, 124289 (2020) https://doi.org/10.1016/j.physa.2020.124289

Kumar, A., Singh, S.S., Singh, K., Biswas, B.: Link prediction techniques, applications, and performance: A survey. Physica A553, 124289 (2020) https://doi.org/10.1016/j.physa.2020.124289

work page doi:10.1016/j.physa.2020.124289 2020
[20]

Knowl-Based Syst283, 111192 (2024) https://doi.org/10.1016/j.knosys

Kong, Q., Sun, J., Xu, Z.: Joint orthogonal symmetric non-negative matrix factorization for community detection in attribute network. Knowl-Based Syst283, 111192 (2024) https://doi.org/10.1016/j.knosys. 2023.111192

work page doi:10.1016/j.knosys 2024
[21]

Bioinformatics 34(4), 652–659 (2018) https://doi.org/10.1093/bioinformatics/btx613

Lever, J., Gakkhar, S., Gottlieb, M., Rashnavadi, T., Lin, S., Siu, C., Smith, M., Jones, M.R., Krzywinski, M., Jones, S.J.: A collaborative filtering-based approach to biomedical knowledge discovery. Bioinformatics 34(4), 652–659 (2018) https://doi.org/10.1093/bioinformatics/btx613

work page doi:10.1093/bioinformatics/btx613 2018
[22]

Neural Computation 19(10), 2756–2779 (2007)

Lin, C.-J.: Projected gradient methods for nonnegative matrix factorization. Neural Comput19(10), 2756– 2779 (2007) https://doi.org/10.1162/neco.2007.19.10.2756

work page doi:10.1162/neco.2007.19.10.2756 2007
[23]

IEEE T Neur Net Lear33(3), 1203–1215 (2022) https: //doi.org/10.1109/TNNLS.2020.3041360

Luo, X., Liu, Z., Jin, L., Zhou, Y., Zhou, M.: Symmetric nonnegative matrix factorization-based community detection models and their convergence analysis. IEEE T Neur Net Lear33(3), 1203–1215 (2022) https: //doi.org/10.1109/TNNLS.2020.3041360

work page doi:10.1109/tnnls.2020.3041360 2022
[24]

J Am Soc Inf Sci Tec58(7), 1019–1031 (2007) https://doi.org/10.1002/asi.20591

Liben-Nowell, D., Kleinberg, J.: The link-prediction problem for social networks. J Am Soc Inf Sci Tec58(7), 1019–1031 (2007) https://doi.org/10.1002/asi.20591

work page doi:10.1002/asi.20591 2007
[25]

In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, pp

Liu, K., Wang, H.: Robust multi-relational clustering viaℓ 1-norm symmetric nonnegative matrix factoriza- tion. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, pp. 397–401. Association for Computational Linguistics, Beijing, China (2015)....

work page doi:10.3115/v1/p15-2065 2015
[26]

https://doi.org/10.24963/ijcai.2018/340

Stockholm, Sweden (2018). https://doi.org/10.24963/ijcai.2018/340

work page doi:10.24963/ijcai.2018/340 2018
[27]

Sci Rep15, 7840 (2025) https://doi.org/10.1038/s41598-025-91757-8

Li, Q., Wang, Y., Wang, J., Zhao, C.: Improving drug repositioning accuracy using non-negative matrix tri-factorization. Sci Rep15, 7840 (2025) https://doi.org/10.1038/s41598-025-91757-8

work page doi:10.1038/s41598-025-91757-8 2025
[28]

Nat Commun10, 805 (2019) https://doi.org/10.1038/s41467-019-08797-8

Malod-Dognin, N., Petschnigg, J., Windels, S.F., Povh, J., Hemingway, H., Ketteler, R., Prˇ zulj, N.: Towards a data-integrated cell. Nat Commun10, 805 (2019) https://doi.org/10.1038/s41467-019-08797-8

work page doi:10.1038/s41467-019-08797-8 2019
[29]

Math Program95(2), 407–430 (2003) https://doi.org/10.1007/s10107-002-0355-5

Mittelmann, H.D.: An independent benchmarking of SDP and SOCP solvers. Math Program95(2), 407–430 (2003) https://doi.org/10.1007/s10107-002-0355-5

work page doi:10.1007/s10107-002-0355-5 2003
[30]

Int J Mol Sci25(17), 9576 (2024) https://doi.org/10.3390/ijms25179576

Messa, L., Testa, C., Carelli, S., Rey, F., Jacchetti, E., Cereda, C., Raimondi, M.T., Ceri, S., Pinoli, P.: Non- negative matrix tri-factorization for representation learning in multi-omics datasets with applications to drug repurposing and selection. Int J Mol Sci25(17), 9576 (2024) https://doi.org/10.3390/ijms25179576

work page doi:10.3390/ijms25179576 2024
[31]

In: Proceedings of the 14th Pacific-Asia Conference on Knowledge Discovery and Data Mining, pp

Ma, H., Zhao, W., Tan, Q., Shi, Z.: Orthogonal nonnegative matrix tri-factorization for semi-supervised doc- ument co-clustering. In: Proceedings of the 14th Pacific-Asia Conference on Knowledge Discovery and Data Mining, pp. 189–200. Springer, Hyderabad, India (2010). https://doi.org/10.1007/978-3-642-13672-6 19

work page doi:10.1007/978-3-642-13672-6 2010
[32]

Knowl-Based Syst201, 106054 (2020) https://doi.org/10.1016/j.knosys.2020.106054

Peng, S., Ser, W., Chen, B., Lin, Z.: Robust orthogonal nonnegative matrix tri-factorization for data representation. Knowl-Based Syst201, 106054 (2020) https://doi.org/10.1016/j.knosys.2020.106054

work page doi:10.1016/j.knosys.2020.106054 2020
[33]

Journal of computational and applied mathematics , volume=

Rousseeuw, P.J.: Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. J 26 Comput Appl Math20, 53–65 (1987) https://doi.org/10.1016/0377-0427(87)90125-7

work page doi:10.1016/0377-0427(87)90125-7 1987
[34]

In: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, pp

Salah, A., Ailem, M., Nadif, M.: Word co-occurrence regularized non-negative matrix tri-factorization for text data co-clustering. In: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, pp. 3992–3999. AAAI Press, New Orleans, LA (2018). https://doi.org/10.1609/aaai.v32i1.11659

work page doi:10.1609/aaai.v32i1.11659 2018
[35]

SIAM J Optimiz20(3), 1364–1377 (2010) https://doi.org/10.1137/070709967

Vavasis, S.A.: On the complexity of nonnegative matrix factorization. SIAM J Optimiz20(3), 1364–1377 (2010) https://doi.org/10.1137/070709967

work page doi:10.1137/070709967 2010
[36]

Stepanov, Anil R

Wang, H., Huang, H., Ding, C.: Simultaneous clustering of multi-type relational data via symmetric nonneg- ative matrix tri-factorization. In: Proceedings of the 20th ACM International Conference on Information and Knowledge Management, pp. 279–284. Association for Computing Machinery, Glasgow, UK (2011). https://doi.org/10.1145/2063576.2063621

work page doi:10.1145/2063576.2063621 2011
[37]

Inform Process Manag46(5), 559–570 (2010) https://doi.org/10.1016/j.ipm.2009.12.007

Yoo, J., Choi, S.: Orthogonal nonnegative matrix tri-factorization for co-clustering: Multiplicative updates on stiefel manifolds. Inform Process Manag46(5), 559–570 (2010) https://doi.org/10.1016/j.ipm.2009.12.007

work page doi:10.1016/j.ipm.2009.12.007 2010
[38]

Big Data Min Anal8(1), 214–232 (2025) https://doi.org/10.26599/ BDMA.2024.9020055

Yu, J., Che, H., Leung, M.-F., Liu, C., Wu, W., Yan, Z.: Robust non-negative matrix tri-factorization with dual hyper-graph regularization. Big Data Min Anal8(1), 214–232 (2025) https://doi.org/10.26599/ BDMA.2024.9020055

arXiv 2025
[39]

In: Proceedings of the 33rd International Conference on Machine Learning, pp

Yang, Z., Cohen, W., Salakhudinov, R.: Revisiting semi-supervised learning with graph embeddings. In: Proceedings of the 33rd International Conference on Machine Learning, pp. 40–48. JMLR, New York, NY (2016). https://doi.org/10.5555/3045390.3045396

work page doi:10.5555/3045390.3045396 2016
[40]

IEEE Trans Emerg Top Comput Intell9(6) (2025) https://doi.org/10.1109/TETCI.2025.3572129

Zhou, Q., Che, H., Guo, W., He, X., Leung, M.-F., Wen, S.: Robust low-rank tensor constrained orthogonal symmetric non-negative matrix factorization for multi-layer networks community detection. IEEE Trans Emerg Top Comput Intell9(6) (2025) https://doi.org/10.1109/TETCI.2025.3572129

work page doi:10.1109/tetci.2025.3572129 2025
[41]

In: Proceedings of the 2016 6th International Conference on Machinery, Materials, Envi- ronment, Biotechnology and Computer

Zhang, G., Cai, G., Wu, H., Zheng, S.: A nonnegative matrix tri-factorization technique for recommendation in microblog. In: Proceedings of the 2016 6th International Conference on Machinery, Materials, Envi- ronment, Biotechnology and Computer. Atlantis Press, Tianjin, China (2016). https://doi.org/10.2991/ mmebc-16.2016.293

2016
[42]

ACM Comput Surv55(2), 38 (2022) https://doi.org/10.1145/3491206 27

Zhou, J., Liu, L., Wei, W., Fan, J.: Network representation learning: From preprocessing, feature extraction to node embedding. ACM Comput Surv55(2), 38 (2022) https://doi.org/10.1145/3491206 27

work page doi:10.1145/3491206 2022

[1] [1]

J Optimiz Theory App190, 234–258 (2021) https://doi.org/10.1007/s10957-021-01880-5

Ahookhosh, M., Hien, L.T.K., Gillis, N., Patrinos, P.: A block inertial Bregman proximal algorithm for nonsmooth nonconvex problems with application to symmetric nonnegative matrix tri-factorization. J Optimiz Theory App190, 234–258 (2021) https://doi.org/10.1007/s10957-021-01880-5

work page doi:10.1007/s10957-021-01880-5 2021

[2] [2]

Mathematics9(5), 540 (2021)

Asadi, S., Povh, J.: A block coordinate descent-based projected gradient algorithm for orthogonal non- negative matrix factorization. Mathematics9(5), 540 (2021)

2021

[3] [3]

Adv Data Anal Classi13, 825–853 (2019) https://doi.org/10.1007/s11634-018-0348-8

Abe, H., Yadohisa, H.: Orthogonal nonnegative matrix tri-factorization based on Tweedie distributions. Adv Data Anal Classi13, 825–853 (2019) https://doi.org/10.1007/s11634-018-0348-8

work page doi:10.1007/s11634-018-0348-8 2019

[4] [4]

Athena Scientific, Natick, MA (2016) 24

Bertsekas, D.P.: Nonlinear Programming. Athena Scientific, Natick, MA (2016) 24

2016

[5] [5]

IEEE T Knowl Data En35(5), 5132–5146 (2022) https://doi.org/10.1109/ TKDE.2022.3145489

Chen, Y., Lei, Z., Rao, Y., Xie, H., Wang, F.L., Yin, J., Li, Q.: Parallel non-negative matrix tri-factorization for text data co-clustering. IEEE T Knowl Data En35(5), 5132–5146 (2022) https://doi.org/10.1109/ TKDE.2022.3145489

arXiv 2022

[6] [6]

John Wiley & Sons, Hoboken, NJ (2009) ˇCopar, A., ˇZitnik, M., Zupan, B.: Scalable non-negative matrix tri-factorization

Cichocki, A., Zdunek, R., Phan, A.H., Amari, S.-i.: Nonnegative Matrix and Tensor Factorizations: Applica- tions to Exploratory Multi-way Data Analysis and Blind Source Separation. John Wiley & Sons, Hoboken, NJ (2009) ˇCopar, A., ˇZitnik, M., Zupan, B.: Scalable non-negative matrix tri-factorization. Biodata Min10, 41 (2017) https://doi.org/10.1186/s1304...

work page doi:10.1186/s13040-017-0160-6 2009

[7] [7]

Found Comput Math20, 119–154 (2020) https://doi.org/10.1007/s10208-018-09409-5

Davis, D., Drusvyatskiy, D., Kakade, S., Lee, J.D.: Stochastic subgradient method converges on tame functions. Found Comput Math20, 119–154 (2020) https://doi.org/10.1007/s10208-018-09409-5

work page doi:10.1007/s10208-018-09409-5 2020

[8] [8]

In: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp

Ding, C., Li, T., Peng, W., Park, H.: Orthogonal nonnegative matrix t-factorizations for clustering. In: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 126–135. Association for Computing Machinery, Philadelphia, PA (2006). https://doi.org/10. 1145/1150402.1150420

arXiv 2006

[9] [9]

Algorithms12(10) (2019) https://doi.org/10.3390/a12100216

Favati, P., Lotti, G., Menchi, O., Romani, F.: Adaptive clustering via symmetric nonnegative matrix factorization of the similarity matrix. Algorithms12(10) (2019) https://doi.org/10.3390/a12100216

work page doi:10.3390/a12100216 2019

[10] [10]

Society for Industrial and Applied Mathematics, Philadelphia, PA (2020)

Gillis, N.: Nonnegative Matrix Factorization. Society for Industrial and Applied Mathematics, Philadelphia, PA (2020)

2020

[11] [11]

KDD - node2vec: Scalable Feature Learning for Networks,

Grover, A., Leskovec, J.: node2vec: Scalable feature learning for networks. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 855–864. Association for Computing Machinery, San Francisco, CA (2016). https://doi.org/10.1145/2939672.2939754 Gligorijevi´ c, V., Malod-Dognin, N., Prˇ zulj, N.: Fuse: Mul...

work page doi:10.1145/2939672.2939754 2016

[12] [12]

In: Proceedings of the 2016 IEEE International Con- ference on Acoustics, Speech and Signal Processing, pp

Hajinezhad, D., Chang, T.-H., Wang, X., Shi, Q., Hong, M.: Nonnegative matrix factorization using ADMM: Algorithm and convergence analysis. In: Proceedings of the 2016 IEEE International Con- ference on Acoustics, Speech and Signal Processing, pp. 4742–4746. IEEE, Shanghai, China (2016). https://doi.org/10.1109/ICASSP.2016.7472577

work page doi:10.1109/icassp.2016.7472577 2016

[13] [13]

J Global Optim82(2), 283–312 (2022) https://doi.org/10.1007/s10898-021-01074-3

Hribar, R., Hrga, T., Papa, G., Petelin, G., Povh, J., Prˇ zulj, N., Vukaˇ sinovi´ c, V.: Four algorithms to solve symmetric multi-type non-negative matrix tri-factorization problem. J Global Optim82(2), 283–312 (2022) https://doi.org/10.1007/s10898-021-01074-3

work page doi:10.1007/s10898-021-01074-3 2022

[14] [14]

SIAM Rev53(2), 217–288 (2011) https://doi.org/10

Halko, N., Martinsson, P.-G., Tropp, J.A.: Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions. SIAM Rev53(2), 217–288 (2011) https://doi.org/10. 1137/090771806

2011

[15] [15]

https://www.cs.toronto.edu/∼tijmen/csc321/slides/lecture slides lec6.pdf (2012)

Hinton, G., Srivastava, N., Swersky, K.: Neural networks for machine learning (Lecture 6a): Overview of mini- batch gradient descent. https://www.cs.toronto.edu/∼tijmen/csc321/slides/lecture slides lec6.pdf (2012)

2012

[16] [16]

Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization (2014) arXiv:1412.6980 [cs.LG]

Pith/arXiv arXiv 2014

[17] [17]

SIAM J Matrix Anal A30(2), 713–730 (2008) https://doi.org/10.1137/ 07069239X 25

Kim, H., Park, H.: Nonnegative matrix factorization based on alternating nonnegativity constrained least squares and active set method. SIAM J Matrix Anal A30(2), 713–730 (2008) https://doi.org/10.1137/ 07069239X 25

2008

[18] [18]

Method Inform Med55(4), 340–346 (2016) https://doi.org/10.3414/ ME15-01-0108

Kastrin, A., Rindflesch, T.C., Hristovski, D.: Link prediction on a network of co-occurring MeSH terms: Towards literature-based discovery. Method Inform Med55(4), 340–346 (2016) https://doi.org/10.3414/ ME15-01-0108

2016

[19] [19]

Physica A553, 124289 (2020) https://doi.org/10.1016/j.physa.2020.124289

Kumar, A., Singh, S.S., Singh, K., Biswas, B.: Link prediction techniques, applications, and performance: A survey. Physica A553, 124289 (2020) https://doi.org/10.1016/j.physa.2020.124289

work page doi:10.1016/j.physa.2020.124289 2020

[20] [20]

Knowl-Based Syst283, 111192 (2024) https://doi.org/10.1016/j.knosys

Kong, Q., Sun, J., Xu, Z.: Joint orthogonal symmetric non-negative matrix factorization for community detection in attribute network. Knowl-Based Syst283, 111192 (2024) https://doi.org/10.1016/j.knosys. 2023.111192

work page doi:10.1016/j.knosys 2024

[21] [21]

Bioinformatics 34(4), 652–659 (2018) https://doi.org/10.1093/bioinformatics/btx613

Lever, J., Gakkhar, S., Gottlieb, M., Rashnavadi, T., Lin, S., Siu, C., Smith, M., Jones, M.R., Krzywinski, M., Jones, S.J.: A collaborative filtering-based approach to biomedical knowledge discovery. Bioinformatics 34(4), 652–659 (2018) https://doi.org/10.1093/bioinformatics/btx613

work page doi:10.1093/bioinformatics/btx613 2018

[22] [22]

Neural Computation 19(10), 2756–2779 (2007)

Lin, C.-J.: Projected gradient methods for nonnegative matrix factorization. Neural Comput19(10), 2756– 2779 (2007) https://doi.org/10.1162/neco.2007.19.10.2756

work page doi:10.1162/neco.2007.19.10.2756 2007

[23] [23]

IEEE T Neur Net Lear33(3), 1203–1215 (2022) https: //doi.org/10.1109/TNNLS.2020.3041360

Luo, X., Liu, Z., Jin, L., Zhou, Y., Zhou, M.: Symmetric nonnegative matrix factorization-based community detection models and their convergence analysis. IEEE T Neur Net Lear33(3), 1203–1215 (2022) https: //doi.org/10.1109/TNNLS.2020.3041360

work page doi:10.1109/tnnls.2020.3041360 2022

[24] [24]

J Am Soc Inf Sci Tec58(7), 1019–1031 (2007) https://doi.org/10.1002/asi.20591

Liben-Nowell, D., Kleinberg, J.: The link-prediction problem for social networks. J Am Soc Inf Sci Tec58(7), 1019–1031 (2007) https://doi.org/10.1002/asi.20591

work page doi:10.1002/asi.20591 2007

[25] [25]

In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, pp

Liu, K., Wang, H.: Robust multi-relational clustering viaℓ 1-norm symmetric nonnegative matrix factoriza- tion. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, pp. 397–401. Association for Computational Linguistics, Beijing, China (2015)....

work page doi:10.3115/v1/p15-2065 2015

[26] [26]

https://doi.org/10.24963/ijcai.2018/340

Stockholm, Sweden (2018). https://doi.org/10.24963/ijcai.2018/340

work page doi:10.24963/ijcai.2018/340 2018

[27] [27]

Sci Rep15, 7840 (2025) https://doi.org/10.1038/s41598-025-91757-8

Li, Q., Wang, Y., Wang, J., Zhao, C.: Improving drug repositioning accuracy using non-negative matrix tri-factorization. Sci Rep15, 7840 (2025) https://doi.org/10.1038/s41598-025-91757-8

work page doi:10.1038/s41598-025-91757-8 2025

[28] [28]

Nat Commun10, 805 (2019) https://doi.org/10.1038/s41467-019-08797-8

Malod-Dognin, N., Petschnigg, J., Windels, S.F., Povh, J., Hemingway, H., Ketteler, R., Prˇ zulj, N.: Towards a data-integrated cell. Nat Commun10, 805 (2019) https://doi.org/10.1038/s41467-019-08797-8

work page doi:10.1038/s41467-019-08797-8 2019

[29] [29]

Math Program95(2), 407–430 (2003) https://doi.org/10.1007/s10107-002-0355-5

Mittelmann, H.D.: An independent benchmarking of SDP and SOCP solvers. Math Program95(2), 407–430 (2003) https://doi.org/10.1007/s10107-002-0355-5

work page doi:10.1007/s10107-002-0355-5 2003

[30] [30]

Int J Mol Sci25(17), 9576 (2024) https://doi.org/10.3390/ijms25179576

Messa, L., Testa, C., Carelli, S., Rey, F., Jacchetti, E., Cereda, C., Raimondi, M.T., Ceri, S., Pinoli, P.: Non- negative matrix tri-factorization for representation learning in multi-omics datasets with applications to drug repurposing and selection. Int J Mol Sci25(17), 9576 (2024) https://doi.org/10.3390/ijms25179576

work page doi:10.3390/ijms25179576 2024

[31] [31]

In: Proceedings of the 14th Pacific-Asia Conference on Knowledge Discovery and Data Mining, pp

Ma, H., Zhao, W., Tan, Q., Shi, Z.: Orthogonal nonnegative matrix tri-factorization for semi-supervised doc- ument co-clustering. In: Proceedings of the 14th Pacific-Asia Conference on Knowledge Discovery and Data Mining, pp. 189–200. Springer, Hyderabad, India (2010). https://doi.org/10.1007/978-3-642-13672-6 19

work page doi:10.1007/978-3-642-13672-6 2010

[32] [32]

Knowl-Based Syst201, 106054 (2020) https://doi.org/10.1016/j.knosys.2020.106054

Peng, S., Ser, W., Chen, B., Lin, Z.: Robust orthogonal nonnegative matrix tri-factorization for data representation. Knowl-Based Syst201, 106054 (2020) https://doi.org/10.1016/j.knosys.2020.106054

work page doi:10.1016/j.knosys.2020.106054 2020

[33] [33]

Journal of computational and applied mathematics , volume=

Rousseeuw, P.J.: Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. J 26 Comput Appl Math20, 53–65 (1987) https://doi.org/10.1016/0377-0427(87)90125-7

work page doi:10.1016/0377-0427(87)90125-7 1987

[34] [34]

In: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, pp

Salah, A., Ailem, M., Nadif, M.: Word co-occurrence regularized non-negative matrix tri-factorization for text data co-clustering. In: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, pp. 3992–3999. AAAI Press, New Orleans, LA (2018). https://doi.org/10.1609/aaai.v32i1.11659

work page doi:10.1609/aaai.v32i1.11659 2018

[35] [35]

SIAM J Optimiz20(3), 1364–1377 (2010) https://doi.org/10.1137/070709967

Vavasis, S.A.: On the complexity of nonnegative matrix factorization. SIAM J Optimiz20(3), 1364–1377 (2010) https://doi.org/10.1137/070709967

work page doi:10.1137/070709967 2010

[36] [36]

Stepanov, Anil R

Wang, H., Huang, H., Ding, C.: Simultaneous clustering of multi-type relational data via symmetric nonneg- ative matrix tri-factorization. In: Proceedings of the 20th ACM International Conference on Information and Knowledge Management, pp. 279–284. Association for Computing Machinery, Glasgow, UK (2011). https://doi.org/10.1145/2063576.2063621

work page doi:10.1145/2063576.2063621 2011

[37] [37]

Inform Process Manag46(5), 559–570 (2010) https://doi.org/10.1016/j.ipm.2009.12.007

Yoo, J., Choi, S.: Orthogonal nonnegative matrix tri-factorization for co-clustering: Multiplicative updates on stiefel manifolds. Inform Process Manag46(5), 559–570 (2010) https://doi.org/10.1016/j.ipm.2009.12.007

work page doi:10.1016/j.ipm.2009.12.007 2010

[38] [38]

Big Data Min Anal8(1), 214–232 (2025) https://doi.org/10.26599/ BDMA.2024.9020055

Yu, J., Che, H., Leung, M.-F., Liu, C., Wu, W., Yan, Z.: Robust non-negative matrix tri-factorization with dual hyper-graph regularization. Big Data Min Anal8(1), 214–232 (2025) https://doi.org/10.26599/ BDMA.2024.9020055

arXiv 2025

[39] [39]

In: Proceedings of the 33rd International Conference on Machine Learning, pp

Yang, Z., Cohen, W., Salakhudinov, R.: Revisiting semi-supervised learning with graph embeddings. In: Proceedings of the 33rd International Conference on Machine Learning, pp. 40–48. JMLR, New York, NY (2016). https://doi.org/10.5555/3045390.3045396

work page doi:10.5555/3045390.3045396 2016

[40] [40]

IEEE Trans Emerg Top Comput Intell9(6) (2025) https://doi.org/10.1109/TETCI.2025.3572129

Zhou, Q., Che, H., Guo, W., He, X., Leung, M.-F., Wen, S.: Robust low-rank tensor constrained orthogonal symmetric non-negative matrix factorization for multi-layer networks community detection. IEEE Trans Emerg Top Comput Intell9(6) (2025) https://doi.org/10.1109/TETCI.2025.3572129

work page doi:10.1109/tetci.2025.3572129 2025

[41] [41]

In: Proceedings of the 2016 6th International Conference on Machinery, Materials, Envi- ronment, Biotechnology and Computer

Zhang, G., Cai, G., Wu, H., Zheng, S.: A nonnegative matrix tri-factorization technique for recommendation in microblog. In: Proceedings of the 2016 6th International Conference on Machinery, Materials, Envi- ronment, Biotechnology and Computer. Atlantis Press, Tianjin, China (2016). https://doi.org/10.2991/ mmebc-16.2016.293

2016

[42] [42]

ACM Comput Surv55(2), 38 (2022) https://doi.org/10.1145/3491206 27

Zhou, J., Liu, L., Wei, W., Fan, J.: Network representation learning: From preprocessing, feature extraction to node embedding. ACM Comput Surv55(2), 38 (2022) https://doi.org/10.1145/3491206 27

work page doi:10.1145/3491206 2022