Learning Disentangled Representations for Generalized Multi-view Clustering

Chang Tang; Kunlun He; Ruimeng Liu; Wanqing Li; Xinwang Liu; Xin Zou; Zhenglai Li

arxiv: 2605.15640 · v1 · pith:JUJ3EJG4new · submitted 2026-05-15 · 💻 cs.CV

Xin Zou, Ruimeng Liu, Chang Tang, Zhenglai Li, Xinwang Liu, Kunlun He, Wanqing Li

Pith reviewed 2026-05-20 18:46 UTC · model grok-4.3

classification 💻 cs.CV

keywords: multi-view clustering · disentangled representations · autoencoders · adversarial learning · mutual information · incomplete views · clustering performance

The pith

Dual-path autoencoders separate view-specific and shared features to improve multi-view clustering.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes a Generalized Multi-view Auto-Encoder that learns disentangled representations by routing source features through dual paths, one for view-specific details and one for view-common structure. Adversarial discriminators push the specific paths to become more discriminative while mutual information modulation keeps the common path aligned and non-collapsed. This setup is tested on both full and partial view scenarios across many standard datasets, where it produces tighter clusters than prior fusion approaches.

Core claim

GMAE decouples source features into view-specific and view-common embeddings through dual-path autoencoders. Cross-view adversarial discriminators guide the specific encoders toward more discriminative features, while mutual information modulation aligns distributions across views and avoids trivial solutions, yielding robust embeddings that support higher-quality clustering even when some views are missing.

What carries the argument

Dual-path autoencoders that split features into view-specific and view-common embeddings, steered by adversarial discriminators and mutual information modulation.
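The dual-path idea can be pictured with a small forward-pass sketch. It is not the authors' implementation: the class `DualPathEncoder`, the helper `fuse_common`, the bias-free linear layers, and the averaging fusion are all assumptions made for illustration, and the adversarial discriminators and mutual information terms are left out entirely.

```python
import random


def linear(weights, x):
    """Apply a bias-free dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def init_weights(out_dim, in_dim, rng):
    """Small random weight matrix stored as a list of rows."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)] for _ in range(out_dim)]


class DualPathEncoder:
    """Hypothetical per-view module with a view-specific and a view-common path."""

    def __init__(self, in_dim, emb_dim, rng):
        self.w_specific = init_weights(emb_dim, in_dim, rng)
        self.w_common = init_weights(emb_dim, in_dim, rng)
        # The decoder sees both embeddings concatenated.
        self.w_decode = init_weights(in_dim, 2 * emb_dim, rng)

    def encode(self, x):
        """Return (view-specific embedding, view-common embedding)."""
        return linear(self.w_specific, x), linear(self.w_common, x)

    def reconstruct(self, specific, common):
        """Map the concatenated embeddings back to the view's input space."""
        return linear(self.w_decode, specific + common)


def fuse_common(common_embeddings):
    """Average the view-common embeddings of the views that are present.

    Missing views are passed as None. Averaging is an assumption of this sketch.
    """
    present = [c for c in common_embeddings if c is not None]
    return [sum(vals) / len(present) for vals in zip(*present)]


rng = random.Random(0)
views = [[0.2, 0.7, 0.1], [0.5, 0.4, 0.9, 0.3]]  # one sample seen through two views
encoders = [DualPathEncoder(len(v), 2, rng) for v in views]
codes = [enc.encode(v) for enc, v in zip(encoders, views)]
shared = fuse_common([common for _, common in codes])
recon = [enc.reconstruct(spec, shared) for enc, (spec, _) in zip(encoders, codes)]
```

Each view keeps its own specific embedding while every decoder receives the same fused common embedding, which is the separation the paragraph above describes. Training losses would sit on top of this structure.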

Load-bearing premise

That separating view-specific and view-common information through dual autoencoder paths will keep complementary details intact while reducing entanglement during fusion.

What would settle it

An ablation on the same 13 benchmark collections would settle it: if clustering accuracy and normalized mutual information hold steady or improve once the dual-path split or the mutual information term is removed, the disentanglement claim fails; a clear drop would support it.
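For readers who want to see what the two metrics measure, here is a minimal sketch in plain Python. The function names are hypothetical. The accuracy routine brute-forces the cluster-to-label mapping, which is only practical for a few clusters (evaluation code normally uses the Hungarian algorithm), and the NMI routine normalizes by the geometric mean of the two entropies, which is one of several common conventions and may differ from the one used in the paper.

```python
import math
from collections import Counter
from itertools import permutations


def clustering_accuracy(y_true, y_pred):
    """Accuracy under the best mapping of predicted cluster ids to true labels."""
    true_ids = sorted(set(y_true))
    pred_ids = sorted(set(y_pred))
    # Pad with None so surplus predicted clusters can stay unmatched.
    targets = true_ids + [None] * max(0, len(pred_ids) - len(true_ids))
    best = 0
    for assignment in permutations(targets, len(pred_ids)):
        mapping = dict(zip(pred_ids, assignment))
        hits = sum(1 for t, p in zip(y_true, y_pred) if mapping[p] == t)
        best = max(best, hits)
    return best / len(y_true)


def normalized_mutual_information(y_true, y_pred):
    """NMI between two labelings, normalized by sqrt(H(true) * H(pred))."""
    n = len(y_true)
    count_t = Counter(y_true)
    count_p = Counter(y_pred)
    joint = Counter(zip(y_true, y_pred))
    mi = sum(
        (c / n) * math.log(c * n / (count_t[t] * count_p[p]))
        for (t, p), c in joint.items()
    )
    h_t = -sum((c / n) * math.log(c / n) for c in count_t.values())
    h_p = -sum((c / n) * math.log(c / n) for c in count_p.values())
    if h_t == 0.0 or h_p == 0.0:
        # Degenerate case: at least one labeling puts everything in one cluster.
        return 1.0 if h_t == h_p else 0.0
    return mi / math.sqrt(h_t * h_p)
```

Both metrics ignore how cluster ids are named, so a prediction that is a pure relabeling of the ground truth scores 1.0 on each.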

Figures

Figures reproduced from arXiv: 2605.15640 by Chang Tang, Kunlun He, Ruimeng Liu, Wanqing Li, Xinwang Liu, Xin Zou, Zhenglai Li.

**Figure 2.** The t-SNE visualization [37] results of feature representations on the STL-10 dataset for different SOTA methods.

**Figure 3.** The flowchart of our proposed GMAE. Specifically, given the multi-view feature matrix, GMAE first employs…

**Figure 4.** (a-d) An analysis of the training process, including complete and incomplete multi-view datasets, respectively. (a) Dermatology (ACC) (b) Dermatology (NMI) (c) MSRCV1 (ACC) (d) MSRCV1 (NMI)

**Figure 5.** Effect of different missing ratios on the ACC/NMI evaluation metrics on the Dermatology and MSRCV1 datasets.

**Figure 6.** The visualization results of feature representations on…

**Figure 7.** Visualization of embedded features Q.

**Figure 8.** Hyperparameter sensitivity analysis. The clustering results (ACC/NMI) vary with different values of α and β. (a) BRCA (ACC) (b) BRCA (NMI) (c) LGG (PUR) (d) LGG (NMI)

**Figure 9.** The hyperparameter sensitivity analysis on the omics-based MVC datasets varies across different values of α and β.

**Figure 10.** (a-d) Effect of embedding dimension d on a range of evaluation metrics across different MVC datasets.

Abstract

Multi-View Clustering (MVC) has gained significant attention for its ability to leverage complementary information across diverse views. However, existing deep MVC methods often struggle with view-distribution entanglement during cross-view fusion, which hampers the quality of the shared latent space and leads to suboptimal Figures. To address this issue, we propose the Generalized Multi-view Auto-Encoder (GMAE), a framework designed to preserve cross-view complementarity through disentangled representation learning. Specifically, GMAE employs dual-path autoencoders to decouple source features into view-specific and view-common embeddings, facilitating the discovery of clearer clustering structures. We further construct cross-view adversarial discriminators to guide view-specific encoders in capturing more discriminative features. By strategically modulating mutual information, GMAE effectively aligns distributions and prevents representation collapse, ensuring the generation of robust, non-trivial embeddings. Comprehensive experiments on 13 benchmark datasets demonstrate that GMAE consistently outperforms state-of-the-art methods in both complete and incomplete MVC tasks. Our code implementation is available at the repository: https://github.com/obananas/GMAE.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

GMAE combines dual-path autoencoders with adversarial discriminators and mutual information modulation for multi-view clustering, including incomplete cases, and reports gains on 13 datasets, but the mechanism's contribution lacks direct verification.

The paper introduces GMAE, a framework that runs dual-path autoencoders to split input features into view-specific and view-common embeddings, then adds cross-view adversarial discriminators and mutual information modulation to align the view distributions and avoid representation collapse. It targets both complete and incomplete multi-view clustering and shows better numbers than prior methods across 13 benchmarks, with code released on GitHub.

That extension to incomplete views is the clearest practical step forward here, since many disentanglement approaches stay limited to full-view settings. The experiments appear broad enough to give a sense of where the method lands in current benchmarks. Releasing the implementation is also useful for anyone who wants to test it directly. The central claim holds up in the reported results, though the gains are described as consistent rather than dramatic.

The soft spot is the missing checks on the modulation step itself. The abstract does not include ablations that remove the mutual information term or report measured mutual information values between the specific and common embeddings, so it remains unclear whether the alignment prevents loss of complementary information or simply adds capacity that helps on these particular datasets. Evaluation details on hyperparameter sensitivity and protocol controls are also thin from what is shown.

This work is aimed at people who run multi-view clustering pipelines in vision or sensor applications and need something that handles missing views without heavy redesign. A reader focused on incremental engineering improvements in representation learning for clustering would find usable ideas. The paper engages the existing disentanglement and MVC literature directly and avoids obvious internal contradictions, so it shows honest engagement with the problem. I would send it for peer review. The empirical coverage and code availability make it worth referee time, even if revisions will likely need stronger evidence that the disentanglement components drive the reported improvements rather than model complexity alone.

Referee Report

2 major / 2 minor

Summary. The paper proposes the Generalized Multi-view Auto-Encoder (GMAE) framework for multi-view clustering (MVC). It decouples source features into view-specific and view-common embeddings via dual-path autoencoders, employs cross-view adversarial discriminators to enhance discriminativeness, and uses mutual information modulation to align distributions and prevent representation collapse. The central claim is that this disentanglement preserves cross-view complementarity and yields clearer clustering structures, with consistent outperformance over state-of-the-art methods on 13 benchmark datasets in both complete and incomplete MVC settings. Code is released at a public repository.

Significance. If the disentanglement mechanism and MI modulation are shown to function as described without inadvertently discarding discriminative information, the work could advance deep MVC by providing a constructive way to handle view-distribution entanglement while retaining complementarity. The release of code supports reproducibility and allows direct verification of the reported gains on the 13 datasets.

major comments (2)

[Method (dual-path autoencoders and MI modulation)] The central performance claims on incomplete MVC tasks rest on the effectiveness of mutual information modulation in preserving complementarity without collapse or loss of discriminative information. However, the method description provides no quantitative verification such as measured MI values between view-specific and view-common embeddings or ablation studies removing the modulation term; without these, it remains possible that reported gains arise from increased model capacity rather than the claimed disentanglement.
[Experiments] §4 (Experiments): The abstract states consistent outperformance on 13 datasets for both complete and incomplete settings, yet the evaluation protocols, hyperparameter sensitivity analysis, and controls for post-hoc choices (e.g., clustering algorithm parameters or view selection in incomplete cases) are not detailed. This undermines confidence in the robustness of the cross-dataset superiority claim.
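One crude way to produce the kind of measured MI value requested in the first major comment is a histogram (plug-in) estimate between one dimension of the view-specific embedding and one dimension of the view-common embedding. The sketch below is an illustration under stated assumptions, not the paper's estimator: the function name, the equal-width binning, and the default bin count are all hypothetical, and full high-dimensional embeddings generally call for a neural or k-nearest-neighbor estimator instead.

```python
import math
from collections import Counter


def binned_mutual_information(xs, ys, n_bins=4):
    """Plug-in MI estimate in nats between two scalar sequences.

    Each sequence is discretized into `n_bins` equal-width bins, then MI is
    computed from the empirical joint and marginal bin frequencies.
    """
    def to_bins(values):
        lo, hi = min(values), max(values)
        # Fall back to a unit width when all values are identical.
        width = (hi - lo) / n_bins or 1.0
        return [min(int((v - lo) / width), n_bins - 1) for v in values]

    bx, by = to_bins(xs), to_bins(ys)
    n = len(xs)
    px, py, pxy = Counter(bx), Counter(by), Counter(zip(bx, by))
    return sum(
        (c / n) * math.log(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )
```

A value near zero on held-out samples would be consistent with the two embeddings carrying separate information, while a large value would suggest the paths are still entangled. The estimate is biased for small samples, so it is best read comparatively, for example with and without the modulation term.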

minor comments (2)

[Abstract] Abstract: 'suboptimal Figures' appears to be a typo and should be clarified (likely intended as 'results' or 'performance').
[Method] The handling of incomplete views is mentioned but lacks explicit description of how the dual-path architecture and discriminators are adapted when views are missing; a dedicated subsection or figure would improve clarity.
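To make the incomplete-view setting concrete, the sketch below builds a missing-view mask for a given missing ratio. The paper's own protocol is not described in the material reviewed here, so everything in it is an assumption: the function name, the definition of the ratio as the fraction of samples that lose at least one view, and the rule that every sample keeps at least one observed view.

```python
import random


def make_missing_mask(n_samples, n_views, missing_ratio, seed=0):
    """Return an n_samples x n_views list of 0/1 flags; 1 means the view is observed.

    A fraction `missing_ratio` of samples is marked incomplete. Each incomplete
    sample drops between 1 and n_views - 1 randomly chosen views, so no sample
    ends up with every view missing.
    """
    if n_views < 2:
        raise ValueError("need at least two views to drop one")
    if not 0.0 <= missing_ratio <= 1.0:
        raise ValueError("missing_ratio must lie in [0, 1]")
    rng = random.Random(seed)
    mask = [[1] * n_views for _ in range(n_samples)]
    n_incomplete = round(missing_ratio * n_samples)
    for i in rng.sample(range(n_samples), n_incomplete):
        n_drop = rng.randint(1, n_views - 1)
        for v in rng.sample(range(n_views), n_drop):
            mask[i][v] = 0
    return mask


mask = make_missing_mask(n_samples=10, n_views=3, missing_ratio=0.5)
```

A mask like this can gate which encoders run for each sample and which views contribute to fusion. Fixing the seed keeps the missing pattern identical across compared methods.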

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments and the opportunity to improve our manuscript. We address each of the major comments point by point below, indicating the revisions we intend to make in the next version.

Referee: [Method (dual-path autoencoders and MI modulation)] The central performance claims on incomplete MVC tasks rest on the effectiveness of mutual information modulation in preserving complementarity without collapse or loss of discriminative information. However, the method description provides no quantitative verification such as measured MI values between view-specific and view-common embeddings or ablation studies removing the modulation term; without these, it remains possible that reported gains arise from increased model capacity rather than the claimed disentanglement.

Authors: We acknowledge the importance of providing quantitative evidence for the mutual information modulation. In the revised manuscript, we will add ablation studies that isolate the effect of the MI modulation term by removing it and comparing performance. Additionally, we will include measurements of mutual information between the view-specific and view-common embeddings to verify the disentanglement and show that discriminative information is preserved. These additions will help demonstrate that the performance gains stem from the proposed disentanglement mechanism.

Revision: yes

Referee: [Experiments] §4 (Experiments): The abstract states consistent outperformance on 13 datasets for both complete and incomplete settings, yet the evaluation protocols, hyperparameter sensitivity analysis, and controls for post-hoc choices (e.g., clustering algorithm parameters or view selection in incomplete cases) are not detailed. This undermines confidence in the robustness of the cross-dataset superiority claim.

Authors: We agree that providing more details on the experimental setup is necessary to support the robustness of our claims. In the revision, we will expand the Experiments section to include a thorough description of the evaluation protocols used across the 13 datasets, a hyperparameter sensitivity analysis, and explicit controls for post-hoc choices including clustering algorithm parameters and view selection procedures in incomplete MVC settings. This will enhance the reproducibility and confidence in the reported results.

Revision: yes

Circularity Check

0 steps flagged

No circularity: GMAE is a constructive empirical framework with no derivation chain reducing to self-defined inputs

The paper introduces GMAE as a new architecture employing dual-path autoencoders for disentangling view-specific and view-common embeddings, cross-view adversarial discriminators, and mutual information modulation to address entanglement in multi-view clustering. All claims rest on experimental validation across 13 datasets rather than any first-principles derivation, uniqueness theorem, or parameter fit that is then relabeled as a prediction. No equations or steps in the provided description reduce by construction to quantities defined from the method's own outputs or prior self-citations; the approach is presented as an independent constructive solution.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract invokes standard deep-learning assumptions about the benefits of disentanglement and adversarial alignment without introducing new free parameters, axioms, or invented entities beyond the proposed architecture itself.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: By strategically modulating mutual information, GMAE effectively aligns distributions and prevents representation collapse... dual-path autoencoders to decouple source features into view-specific and view-common embeddings

IndisputableMonolith/Foundation/ArithmeticFromLogic.lean · embed_injective
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: We rigorously derive an optimizable loss function based on the task of mutual information estimation... disentangled representations learned by GMAE framework contain more cluster-relevant information

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

90 extracted references · 90 canonical work pages · 5 internal anchors

[1] M. Kan, S. Shan, H. Zhang, S. Lao, and X. Chen, “Multi-view discriminant analysis,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 1, pp. 188–194, 2015.
[2] K. Sridharan and S. M. Kakade, “An information theoretic framework for multi-view learning,” in COLT, no. 114, 2008, pp. 403–414.
[3] U. Fang, M. Li, J. Li, L. Gao, T. Jia, and Y. Zhang, “A comprehensive survey on multi-view clustering,” IEEE Transactions on Knowledge and Data Engineering, vol. 35, no. 12, pp. 12350–12368, 2023.
[4] T. Ding, W. K. Bickel, and S. Pan, “Multi-view unsupervised user feature embedding for social media-based substance use prediction,” in Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017, pp. 2275–2284.
[5] J. Boekel, J. M. Chilton, I. R. Cooke, P. L. Horvatovich, P. D. Jagtap, L. Käll, J. Lehtiö, P. Lukasse, P. D. Moerland, and T. J. Griffin, “Multi-omic data analysis using galaxy,” Nature Biotechnology, vol. 33, no. 2, pp. 137–139, 2015.
[6] X. Zou, C. Tang, W. Zhang, K. Sun, and L. Jiang, “Hierarchical attention learning for multimodal classification,” in 2023 IEEE International Conference on Multimedia and Expo (ICME). IEEE, 2023, pp. 936–941.
[7] X. Zou, C. Tang, X. Zheng, Z. Li, X. He, S. An, and X. Liu, “Dpnet: Dynamic poly-attention network for trustworthy multi-modal classification,” in Proceedings of the 31st ACM International Conference on Multimedia, 2023, pp. 3550–3559.
[8] X. Zou, X. He, X. Zheng, W. Zhang, J. Chen, and C. Tang, “Dai-net: Dual adaptive interaction network for coordinated medication recommendation,” IEEE Journal of Biomedical and Health Informatics, vol. 28, pp. 6201–6211, 2024.
[9] Y. Zhang, J. Yang, J. Tian, Z. Shi, C. Zhong, Y. Zhang, and Z. He, “Modality-aware mutual learning for multi-modal medical image segmentation,” in Medical Image Computing and Computer Assisted Intervention–MICCAI 2021: 24th International Conference, Strasbourg, France, September 27–October 1, 2021, Proceedings, Part I 24. Springer, 2021, pp. 589–599.
[10] D. J. Trosten, S. Lokse, R. Jenssen, and M. Kampffmeyer, “Reconsidering representation alignment for multi-view clustering,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 1255–1265.
[11] Z. Li, C. Tang, X. Liu, X. Zheng, W. Zhang, and E. Zhu, “Consensus graph learning for multi-view clustering,” IEEE Transactions on Multimedia, vol. 24, pp. 2461–2472, 2021.
[12] J. Xu, C. Li, L. Peng, Y. Ren, X. Shi, H. T. Shen, and X. Zhu, “Adaptive feature projection with distribution alignment for deep incomplete multi-view clustering,” IEEE Transactions on Image Processing, vol. 32, pp. 1354–1366, 2023.
[13] K. Liang, L. Meng, H. Li, J. Wang, L. Lan, M. Li, X. Liu, and H. Wang, “From concrete to abstract: Multi-view clustering on relational knowledge,” IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1–18, 2025.
[14] J. Liu, X. Liu, Y. Yang, L. Liu, S. Wang, W. Liang, and J. Shi, “One-pass multi-view clustering for large-scale data,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 12344–12353.
[15] J. Li, Q. Gao, Q. Wang, M. Yang, and W. Xia, “Orthogonal non-negative tensor factorization based multi-view clustering,” Advances in Neural Information Processing Systems, vol. 36, 2024.
[16] K. Liang, L. Meng, M. Liu, Y. Liu, W. Tu, S. Wang, S. Zhou, X. Liu, F. Sun, and K. He, “A survey of knowledge graph reasoning on graph types: Static, dynamic, and multi-modal,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 46, no. 12, pp. 9456–9478, 2024.
[17] H. Wang, Y. Yang, and B. Liu, “Gmc: Graph-based multi-view clustering,” IEEE Transactions on Knowledge and Data Engineering, vol. 32, no. 6, pp. 1116–1129, 2019.
[18] X. Zou, C. Tang, X. Zheng, K. Sun, W. Zhang, and D. Ding, “Inclusivity induced adaptive graph learning for multi-view clustering,” Knowledge-Based Systems, vol. 267, p. 110424, 2023.
[19] E. Pan and Z. Kang, “Multi-view contrastive graph clustering,” Advances in Neural Information Processing Systems, vol. 34, pp. 2148–2159, 2021.
[20] C. Tang, Z. Li, J. Wang, X. Liu, W. Zhang, and E. Zhu, “Unified one-step multi-view spectral clustering,” IEEE Transactions on Knowledge and Data Engineering, vol. 35, no. 6, pp. 6449–6460, 2022.
[21] X. Cao, C. Zhang, H. Fu, S. Liu, and H. Zhang, “Diversity-induced multi-view subspace clustering,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 586–594.
[22] H. Gao, F. Nie, X. Li, and H. Huang, “Multi-view subspace clustering,” in Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 4238–4246.
[23] C. Zhang, H. Fu, Q. Hu, X. Cao, Y. Xie, D. Tao, and D. Xu, “Generalized latent multi-view subspace clustering,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 42, no. 1, pp. 86–99, 2018.
[24] G. Chao, S. Sun, and J. Bi, “A survey on multiview clustering,” IEEE Transactions on Artificial Intelligence, vol. 2, no. 2, pp. 146–168, 2021.
[25] Z. Li, Q. Wang, Z. Tao, Q. Gao, Z. Yang et al., “Deep adversarial multi-view clustering network,” in IJCAI, vol. 2, no. 3, 2019, p. 4.
[26] H. Tang and Y. Liu, “Deep safe incomplete multi-view clustering: Theorem and algorithm,” in International Conference on Machine Learning. PMLR, 2022, pp. 21090–21110.
[27] Y. Xiao, D. Yang, J. Li, X. Zou, H. Zhou, and C. Tang, “Dual alignment feature embedding network for multi-omics data clustering,” Knowledge-Based Systems, vol. 309, p. 112774, 2025.
[28] D. J. Trosten, S. Løkse, R. Jenssen, and M. C. Kampffmeyer, “On the effects of self-supervision and contrastive alignment in deep multi-view clustering,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 23976–23985.
[29] C. Cui, Y. Ren, J. Pu, J. Li, X. Pu, T. Wu, Y. Shi, and L. He, “A novel approach for effective multi-view clustering with information-theoretic perspective,” Advances in Neural Information Processing Systems, vol. 36, 2024.
[30] J. Zhu, X. Zou, L. Liu, Z. Huang, Y. Zhang, C. Tang, and L.-R. Dai, “Trusted mamba contrastive network for multi-view clustering,” arXiv preprint arXiv:2412.16487, 2024.
[31] G. Ke, B. Wang, X. Wang, and S. He, “Rethinking multi-view representation learning via distilled disentangling,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 26774–26783.
[33] Y. Lin, Y. Gou, Z. Liu, B. Li, J. Lv, and X. Peng, “Completer: Incomplete multi-view clustering via contrastive prediction,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 11174–11183.
[34] W. Yan, Y. Zhang, C. Lv, C. Tang, G. Yue, L. Liao, and W. Lin, “Gcfagg: Global and cross-view feature aggregation for multi-view clustering,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 19863–19872.
[35] S. Wu, Y. Zheng, Y. Ren, J. He, X. Pu, S. Huang, Z. Hao, and L. He, “Self-weighted contrastive fusion for deep multi-view clustering,” IEEE Transactions on Multimedia, 2024.
[36] J. Xu, Y. Ren, X. Wang, L. Feng, Z. Zhang, G. Niu, and X. Zhu, “Investigating and mitigating the side effects of noisy views for self-supervised clustering algorithms in practical multi-view scenarios,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 22957–22966.
[37] L. Van der Maaten and G. Hinton, “Visualizing data using t-sne,” Journal of Machine Learning Research, vol. 9, no. 11, 2008.
[38] G. E. Hinton and R. Zemel, “Autoencoders, minimum description length and helmholtz free energy,” Advances in Neural Information Processing Systems, vol. 6, 1993.
[39] J. Xie, R. Girshick, and A. Farhadi, “Unsupervised deep embedding for clustering analysis,” in International Conference on Machine Learning. PMLR, 2016, pp. 478–487.
[40] X. Guo, L. Gao, X. Liu, and J. Yin, “Improved deep embedded clustering with local structure preservation,” in IJCAI, vol. 17, 2017, pp. 1753–1759.
[41] Z. Jiang, Y. Zheng, H. Tan, B. Tang, and H. Zhou, “Variational deep embedding: An unsupervised and generative approach to clustering,” arXiv preprint arXiv:1611.05148, 2016.
[42] N. Dilokthanakul, P. A. Mediano, M. Garnelo, M. C. Lee, H. Salimbeni, K. Arulkumaran, and M. Shanahan, “Deep unsupervised clustering with gaussian mixture variational autoencoders,” arXiv preprint arXiv:1611.02648, 2016.
[43] J. Xu, Y. Ren, H. Tang, X. Pu, X. Zhu, M. Zeng, and L. He, “Multi-vae: Learning disentangled view-common and view-peculiar visual representations for multi-view clustering,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 9234–9243.
[44] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial nets,” Advances in Neural Information Processing Systems, vol. 27, 2014.
[45] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial networks,” Communications of the ACM, vol. 63, no. 11, pp. 139–144, 2020.
[46] R. Corvi, D. Cozzolino, G. Poggi, K. Nagano, and L. Verdoliva, “Intriguing properties of synthetic images: from generative adversarial networks to diffusion models,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 973–982.
[47] P. Ge, C.-X. Ren, D.-Q. Dai, J. Feng, and S. Yan, “Dual adversarial autoencoders for clustering,” IEEE Transactions on Neural Networks and Learning Systems, vol. 31, no. 4, pp. 1417–1424, 2019.
[48] R. Liu, X. Zou, C. Tang, X. Zheng, X. Hu, K. Sun, and X. Liu, “Sparsemvc: Probing cross-view sparsity variations for multi-view clustering,” in The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025.
[49] M.-S. Chen, J.-Q. Lin, X.-L. Li, B.-Y. Liu, C.-D. Wang, D. Huang, and J.-H. Lai, “Representation learning in multi-view clustering: A literature review,” Data Science and Engineering, vol. 7, no. 3, pp. 225–241, 2022.
[50] A. J. Bell and T. J. Sejnowski, “An information-maximization approach to blind separation and blind deconvolution,” Neural Computation, vol. 7, no. 6, pp. 1129–1159, 1995.
[51] R. D. Hjelm, A. Fedorov, S. Lavoie-Marchildon, K. Grewal, P. Bachman, A. Trischler, and Y. Bengio, “Learning deep representations by mutual information estimation and maximization,” arXiv preprint arXiv:1808.06670, 2018.
[52] P. Bachman, R. D. Hjelm, and W. Buchwalter, “Learning representations by maximizing mutual information across views,” Advances in Neural Information Processing Systems, vol. 32, 2019.
[53]

Deep mutual information maximin for cross-modal clustering

Y. Mao, X. Yan, Q. Guo, and Y. Ye, “Deep mutual information maximin for cross-modal clustering,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, no. 10, 2021, pp. 8893–8901

[54]

Multi-view clustering via triplex information maximization

C. Zhang, Z. Lou, Q. Zhou, and S. Hu, “Multi-view clustering via triplex information maximization,” IEEE Transactions on Image Processing, 2023

[55]

Decoupled contrastive multi-view clustering with high-order random walks

Y. Lu, Y. Lin, M. Yang, D. Peng, P. Hu, and X. Peng, “Decoupled contrastive multi-view clustering with high-order random walks,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 38, no. 13, 2024, pp. 14193–14201

[56]

Mcoco: Multi-level consistency collaborative multi-view clustering

Y. Zhou, Q. Zheng, Y. Wang, W. Yan, P. Shi, and J. Zhu, “Mcoco: Multi-level consistency collaborative multi-view clustering,” Expert Systems with Applications, vol. 238, p. 121976, 2024

[57]

beta-vae: Learning basic visual concepts with a constrained variational framework

I. Higgins, L. Matthey, A. Pal, C. P. Burgess, X. Glorot, M. M. Botvinick, S. Mohamed, and A. Lerchner, “beta-vae: Learning basic visual concepts with a constrained variational framework.” ICLR (Poster), vol. 3, 2017

[58]

Understanding disentangling in $\beta$-VAE

C. P. Burgess, I. Higgins, A. Pal, L. Matthey, N. Watters, G. Desjardins, and A. Lerchner, “Understanding disentangling in $\beta$-VAE,” arXiv preprint arXiv:1804.03599, 2018

[59]

Challenging common assumptions in the unsupervised learning of disentangled representations

F. Locatello, S. Bauer, M. Lucic, G. Raetsch, S. Gelly, B. Schölkopf, and O. Bachem, “Challenging common assumptions in the unsupervised learning of disentangled representations,” in international conference on machine learning. PMLR, 2019, pp. 4114–4124

[60]

Infogan: Interpretable representation learning by information maximizing generative adversarial nets

X. Chen, Y. Duan, R. Houthooft, J. Schulman, I. Sutskever, and P. Abbeel, “Infogan: Interpretable representation learning by information maximizing generative adversarial nets,” Advances in neural information processing systems, vol. 29, 2016

[61]

Causalvae: Disentangled representation learning via neural structural causal models

M. Yang, F. Liu, Z. Chen, X. Shen, J. Hao, and J. Wang, “Causalvae: Disentangled representation learning via neural structural causal models,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2021, pp. 9593–9602

[62]

High-fidelity synthesis with disentangled representation

W. Lee, D. Kim, S. Hong, and H. Lee, “High-fidelity synthesis with disentangled representation,” in Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XXVI 16. Springer, 2020, pp. 157–174

[63]

Vmi-vae: Variational mutual information maximization framework for vae with discrete and continuous priors

A. Serdega and D.-S. Kim, “Vmi-vae: Variational mutual information maximization framework for vae with discrete and continuous priors,” arXiv preprint arXiv:2005.13953, 2020

[64]

Debiasing graph neural networks via learning disentangled causal substructure

S. Fan, X. Wang, Y. Mo, C. Shi, and J. Tang, “Debiasing graph neural networks via learning disentangled causal substructure,” Advances in Neural Information Processing Systems, vol. 35, pp. 24934–24946, 2022

[65]

Multi-level feature learning for contrastive multi-view clustering

J. Xu, H. Tang, Y. Ren, L. Peng, X. Zhu, and L. He, “Multi-level feature learning for contrastive multi-view clustering,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2022, pp. 16051–16060

[66]

Reducing the dimensionality of data with neural networks

G. E. Hinton and R. R. Salakhutdinov, “Reducing the dimensionality of data with neural networks,” science, vol. 313, no. 5786, pp. 504–507, 2006

[67]

Deep spectral clustering using dual autoencoder network

X. Yang, C. Deng, F. Zheng, J. Yan, and W. Liu, “Deep spectral clustering using dual autoencoder network,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2019, pp. 4066–4075

[68]

Adaptive graph auto-encoder for general data clustering

X. Li, H. Zhang, and R. Zhang, “Adaptive graph auto-encoder for general data clustering,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 44, no. 12, pp. 9725–9732, 2021

[69]

Trustworthy multi-view clustering via alternating generative adversarial representation learning and fusion

W. Yang, M. Wang, C. Tang, X. Zheng, X. Liu, and K. He, “Trustworthy multi-view clustering via alternating generative adversarial representation learning and fusion,” Information Fusion, vol. 107, p. 102323, 2024

[70]

Zeronas: Differentiable generative adversarial networks search for zero-shot learning

C. Yan, X. Chang, Z. Li, W. Guan, Z. Ge, L. Zhu, and Q. Zheng, “Zeronas: Differentiable generative adversarial networks search for zero-shot learning,” IEEE transactions on pattern analysis and machine intelligence, vol. 44, no. 12, pp. 9733–9740, 2021

[71]

Representation Learning with Contrastive Predictive Coding

A. v. d. Oord, Y. Li, and O. Vinyals, “Representation learning with contrastive predictive coding,” arXiv preprint arXiv:1807.03748, 2018

[72]

Mutual information-driven multi-view clustering

L. Zhang, L. Fu, T. Wang, C. Chen, and C. Zhang, “Mutual information-driven multi-view clustering,” in Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 2023, pp. 3268–3277

[73]

Disentangled multiplex graph representation learning

Y. Mo, Y. Lei, J. Shen, X. Shi, H. T. Shen, and X. Zhu, “Disentangled multiplex graph representation learning,” in International Conference on Machine Learning. PMLR, 2023, pp. 24983–25005

[74]

Dual contrastive prediction for incomplete multi-view representation learning

Y. Lin, Y. Gou, X. Liu, J. Bai, J. Lv, and X. Peng, “Dual contrastive prediction for incomplete multi-view representation learning,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 4, pp. 4447–4461, 2022

[75]

Deep safe multi-view clustering: Reducing the risk of clustering performance degradation caused by view increase

H. Tang and Y. Liu, “Deep safe multi-view clustering: Reducing the risk of clustering performance degradation caused by view increase,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 202–211

[76]

Robust multi-view clustering with incomplete information

M. Yang, Y. Li, P. Hu, J. Bai, J. Lv, and X. Peng, “Robust multi-view clustering with incomplete information,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 1, pp. 1055–1069, 2022

[77]

Dealmvc: Dual contrastive calibration for multi-view clustering

X. Yang, J. Jiaqi, S. Wang, K. Liang, Y. Liu, Y. Wen, S. Liu, S. Zhou, X. Liu, and E. Zhu, “Dealmvc: Dual contrastive calibration for multi-view clustering,” in Proceedings of the 31st ACM International Conference on Multimedia, 2023, pp. 337–346

[78]

Deep incomplete multi-view clustering with cross-view partial sample and prototype alignment

J. Jin, S. Wang, Z. Dong, X. Liu, and E. Zhu, “Deep incomplete multi-view clustering with cross-view partial sample and prototype alignment,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2023, pp. 11600–11609

[79]

Self-supervised discriminative feature learning for deep multi-view clustering

J. Xu, Y. Ren, H. Tang, Z. Yang, L. Pan, Y. Yang, X. Pu, S. Y. Philip, and L. He, “Self-supervised discriminative feature learning for deep multi-view clustering,” IEEE Transactions on Knowledge and Data Engineering, vol. 35, no. 7, pp. 7470–7482, 2023

[80]

Deep multi-view clustering by contrasting cluster assignments

J. Chen, H. Mao, W. L. Woo, and X. Peng, “Deep multi-view clustering by contrasting cluster assignments,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 16752–16761

[81]

Incomplete multi-view clustering via diffusion contrastive generation

Y. Zhang, Y. Lin, W. Yan, L. Yao, X. Wan, G. Li, C. Zhang, G. Ke, and J. Xu, “Incomplete multi-view clustering via diffusion contrastive generation,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 39, no. 21, 2025, pp. 22650–22658

Showing first 80 references.

[1] [1]

Multi-view discriminant analysis,

M. Kan, S. Shan, H. Zhang, S. Lao, and X. Chen, “Multi-view discriminant analysis,”IEEE transactions on pattern analysis and machine intelligence, vol. 38, no. 1, pp. 188–194, 2015

work page 2015

[2] [2]

An information theo- retic framework for multi-view learning,

K. Sridharan and S. M. Kakade, “An information theo- retic framework for multi-view learning,” inCOLT, no. 114, 2008, pp. 403–414

work page 2008

[3] [3]

A comprehensive survey on multi-view clustering,

U. Fang, M. Li, J. Li, L. Gao, T. Jia, and Y. Zhang, “A comprehensive survey on multi-view clustering,”IEEE IEEE TRANSACTIONS ON PATTERN ANAL YSIS AND MACHINE INTELLIGENCE 13 Transactions on Knowledge and Data Engineering, vol. 35, no. 12, pp. 12 350–12 368, 2023

work page 2023

[4] [4]

Multi-view unsuper- vised user feature embedding for social media-based substance use prediction,

T. Ding, W. K. Bickel, and S. Pan, “Multi-view unsuper- vised user feature embedding for social media-based substance use prediction,” inProceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017, pp. 2275–2284

work page 2017

[5] [5]

Multi-omic data analysis using galaxy,

J. Boekel, J. M. Chilton, I. R. Cooke, P . L. Horvatovich, P . D. Jagtap, L. K ¨all, J. Lehti ¨o, P . Lukasse, P . D. Moer- land, and T. J. Griffin, “Multi-omic data analysis using galaxy,”Nature biotechnology, vol. 33, no. 2, pp. 137–139, 2015

work page 2015

[6] [6]

Hierarchical attention learning for multimodal classi- fication,

X. Zou, C. Tang, W. Zhang, K. Sun, and L. Jiang, “Hierarchical attention learning for multimodal classi- fication,” in2023 IEEE International Conference on Mul- timedia and Expo (ICME). IEEE, 2023, pp. 936–941

work page 2023

[7] [7]

Dpnet: Dynamic poly-attention network for trustworthy multi-modal classification,

X. Zou, C. Tang, X. Zheng, Z. Li, X. He, S. An, and X. Liu, “Dpnet: Dynamic poly-attention network for trustworthy multi-modal classification,” inProceedings of the 31st ACM International Conference on Multimedia, 2023, pp. 3550–3559

work page 2023

[8] [8]

Dai-net: Dual adaptive interaction network for coordinated medication recommendation,

X. Zou, X. He, X. Zheng, W. Zhang, J. Chen, and C. Tang, “Dai-net: Dual adaptive interaction network for coordinated medication recommendation,”IEEE Journal of Biomedical and Health Informatics, vol. 28, pp. 6201–6211, 2024

work page 2024

[9] [9]

Modality-aware mutual learning for multi- modal medical image segmentation,

Y. Zhang, J. Yang, J. Tian, Z. Shi, C. Zhong, Y. Zhang, and Z. He, “Modality-aware mutual learning for multi- modal medical image segmentation,” inMedical Image Computing and Computer Assisted Intervention–MICCAI 2021: 24th International Conference, Strasbourg, France, September 27–October 1, 2021, Proceedings, Part I 24. Springer, 2021, pp. 589–599

work page 2021

[10] [10]

Reconsidering representation alignment for multi- view clustering,

D. J. Trosten, S. Lokse, R. Jenssen, and M. Kampffmeyer, “Reconsidering representation alignment for multi- view clustering,” inProceedings of the IEEE/CVF confer- ence on computer vision and pattern recognition, 2021, pp. 1255–1265

work page 2021

[11] [11]

Consensus graph learning for multi-view clustering,

Z. Li, C. Tang, X. Liu, X. Zheng, W. Zhang, and E. Zhu, “Consensus graph learning for multi-view clustering,” IEEE Transactions on Multimedia, vol. 24, pp. 2461–2472, 2021

work page 2021

[12] [12]

Adaptive feature projection with distribution alignment for deep incomplete multi-view clustering,

J. Xu, C. Li, L. Peng, Y. Ren, X. Shi, H. T. Shen, and X. Zhu, “Adaptive feature projection with distribution alignment for deep incomplete multi-view clustering,” IEEE Transactions on Image Processing, vol. 32, pp. 1354– 1366, 2023

work page 2023

[13] [13]

From concrete to abstract: Multi-view clustering on relational knowledge,

K. Liang, L. Meng, H. Li, J. Wang, L. Lan, M. Li, X. Liu, and H. Wang, “From concrete to abstract: Multi-view clustering on relational knowledge,”IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1–18, 2025

work page 2025

[14] [14]

One-pass multi-view clustering for large- scale data,

J. Liu, X. Liu, Y. Yang, L. Liu, S. Wang, W. Liang, and J. Shi, “One-pass multi-view clustering for large- scale data,” inProceedings of the IEEE/CVF international conference on computer vision, 2021, pp. 12 344–12 353

work page 2021

[15] [15]

Orthogo- nal non-negative tensor factorization based multi-view clustering,

J. Li, Q. Gao, Q. Wang, M. Yang, and W. Xia, “Orthogo- nal non-negative tensor factorization based multi-view clustering,”Advances in Neural Information Processing Systems, vol. 36, 2024

work page 2024

[16] [16]

A survey of knowl- edge graph reasoning on graph types: Static, dynamic, and multi-modal,

K. Liang, L. Meng, M. Liu, Y. Liu, W. Tu, S. Wang, S. Zhou, X. Liu, F. Sun, and K. He, “A survey of knowl- edge graph reasoning on graph types: Static, dynamic, and multi-modal,”IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 46, no. 12, pp. 9456–9478, 2024

work page 2024

[17] [17]

Gmc: Graph-based multi-view clustering,

H. Wang, Y. Yang, and B. Liu, “Gmc: Graph-based multi-view clustering,”IEEE Transactions on Knowledge and Data Engineering, vol. 32, no. 6, pp. 1116–1129, 2019

work page 2019

[18] [18]

Inclusivity induced adaptive graph learning for multi-view clustering,

X. Zou, C. Tang, X. Zheng, K. Sun, W. Zhang, and D. Ding, “Inclusivity induced adaptive graph learning for multi-view clustering,”Knowledge-Based Systems, vol. 267, p. 110424, 2023

work page 2023

[19] [19]

Multi-view contrastive graph clustering,

E. Pan and Z. Kang, “Multi-view contrastive graph clustering,”Advances in neural information processing systems, vol. 34, pp. 2148–2159, 2021

work page 2021

[20] [20]

Unified one-step multi-view spectral clustering,

C. Tang, Z. Li, J. Wang, X. Liu, W. Zhang, and E. Zhu, “Unified one-step multi-view spectral clustering,”IEEE Transactions on Knowledge and Data Engineering, vol. 35, no. 6, pp. 6449–6460, 2022

work page 2022

[21] [21]

Diversity-induced multi-view subspace clustering,

X. Cao, C. Zhang, H. Fu, S. Liu, and H. Zhang, “Diversity-induced multi-view subspace clustering,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2015, pp. 586–594

work page 2015

[22] [22]

Multi-view sub- space clustering,

H. Gao, F. Nie, X. Li, and H. Huang, “Multi-view sub- space clustering,” inProceedings of the IEEE international conference on computer vision, 2015, pp. 4238–4246

work page 2015

[23] [23]

Generalized latent multi-view subspace clus- tering,

C. Zhang, H. Fu, Q. Hu, X. Cao, Y. Xie, D. Tao, and D. Xu, “Generalized latent multi-view subspace clus- tering,”IEEE transactions on pattern analysis and machine intelligence, vol. 42, no. 1, pp. 86–99, 2018

work page 2018

[24] [24]

A survey on multiview clustering,

G. Chao, S. Sun, and J. Bi, “A survey on multiview clustering,”IEEE transactions on artificial intelligence, vol. 2, no. 2, pp. 146–168, 2021

work page 2021

[25] [25]

Deep adversarial multi-view clustering network

Z. Li, Q. Wang, Z. Tao, Q. Gao, Z. Yanget al., “Deep adversarial multi-view clustering network.” inIJCAI, vol. 2, no. 3, 2019, p. 4

work page 2019

[26] [26]

Deep safe incomplete multi-view clustering: Theorem and algorithm,

H. Tang and Y. Liu, “Deep safe incomplete multi-view clustering: Theorem and algorithm,” inInternational Conference on Machine Learning. PMLR, 2022, pp. 21 090–21 110

work page 2022

[27] [27]

Dual alignment feature embedding network for multi- omics data clustering,

Y. Xiao, D. Yang, J. Li, X. Zou, H. Zhou, and C. Tang, “Dual alignment feature embedding network for multi- omics data clustering,”Knowledge-Based Systems, vol. 309, p. 112774, 2025

work page 2025

[28] [28]

On the effects of self-supervision and contrastive alignment in deep multi-view clustering,

D. J. Trosten, S. Løkse, R. Jenssen, and M. C. Kampffmeyer, “On the effects of self-supervision and contrastive alignment in deep multi-view clustering,” inProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2023, pp. 23 976–23 985

work page 2023

[29] [29]

A novel approach for effective multi-view clustering with information-theoretic perspective,

C. Cui, Y. Ren, J. Pu, J. Li, X. Pu, T. Wu, Y. Shi, and L. He, “A novel approach for effective multi-view clustering with information-theoretic perspective,”Advances in Neural Information Processing Systems, vol. 36, 2024

work page 2024

[30] [30]

Trusted mamba contrastive network for multi-view clustering,

J. Zhu, X. Zou, L. Liu, Z. Huang, Y. Zhang, C. Tang, and L.-R. Dai, “Trusted mamba contrastive network for multi-view clustering,”arXiv preprint arXiv:2412.16487, 2024

work page arXiv 2024

[31] [31]

Rethinking multi-view representation learning via distilled dis- entangling,

G. Ke, B. Wang, X. Wang, and S. He, “Rethinking multi-view representation learning via distilled dis- entangling,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. IEEE TRANSACTIONS ON PATTERN ANAL YSIS AND MACHINE INTELLIGENCE 14 26 774–26 783

work page 2024

[32] [33]

Com- pleter: Incomplete multi-view clustering via contrastive prediction,

Y. Lin, Y. Gou, Z. Liu, B. Li, J. Lv, and X. Peng, “Com- pleter: Incomplete multi-view clustering via contrastive prediction,” inProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2021, pp. 11 174– 11 183

work page 2021

[33] [34]

Gcfagg: Global and cross-view feature aggregation for multi-view clustering,

W. Yan, Y. Zhang, C. Lv, C. Tang, G. Yue, L. Liao, and W. Lin, “Gcfagg: Global and cross-view feature aggregation for multi-view clustering,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 19 863–19 872

work page 2023

[34] [35]

Self-weighted contrastive fusion for deep multi-view clustering,

S. Wu, Y. Zheng, Y. Ren, J. He, X. Pu, S. Huang, Z. Hao, and L. He, “Self-weighted contrastive fusion for deep multi-view clustering,”IEEE Transactions on Multimedia, 2024

work page 2024

[35] [36]

Investigating and mitigating the side effects of noisy views for self-supervised clustering algorithms in practical multi-view scenarios,

J. Xu, Y. Ren, X. Wang, L. Feng, Z. Zhang, G. Niu, and X. Zhu, “Investigating and mitigating the side effects of noisy views for self-supervised clustering algorithms in practical multi-view scenarios,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 22 957–22 966

work page 2024

[36] [37]

Visualizing data using t-sne

L. Van der Maaten and G. Hinton, “Visualizing data using t-sne.”Journal of machine learning research, vol. 9, no. 11, 2008

work page 2008

[37] [38]

Autoencoders, minimum description length and helmholtz free energy,

G. E. Hinton and R. Zemel, “Autoencoders, minimum description length and helmholtz free energy,”Ad- vances in neural information processing systems, vol. 6, 1993

work page 1993

[38] [39]

Unsupervised deep embedding for clustering analysis,

J. Xie, R. Girshick, and A. Farhadi, “Unsupervised deep embedding for clustering analysis,” inInternational con- ference on machine learning. PMLR, 2016, pp. 478–487

work page 2016

[39] [40]

Improved deep em- bedded clustering with local structure preservation

X. Guo, L. Gao, X. Liu, and J. Yin, “Improved deep em- bedded clustering with local structure preservation.” in Ijcai, vol. 17, 2017, pp. 1753–1759

work page 2017

[40] [41]

Variational Deep Embedding: An Unsupervised and Generative Approach to Clustering

Z. Jiang, Y. Zheng, H. Tan, B. Tang, and H. Zhou, “Variational deep embedding: An unsupervised and generative approach to clustering,”arXiv preprint arXiv:1611.05148, 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016

[41] [42]

Deep Unsupervised Clustering with Gaussian Mixture Variational Autoencoders

N. Dilokthanakul, P . A. Mediano, M. Garnelo, M. C. Lee, H. Salimbeni, K. Arulkumaran, and M. Shana- han, “Deep unsupervised clustering with gaussian mixture variational autoencoders,”arXiv preprint arXiv:1611.02648, 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016

[42] [43]

Multi-vae: Learning disentangled view- common and view-peculiar visual representations for multi-view clustering,

J. Xu, Y. Ren, H. Tang, X. Pu, X. Zhu, M. Zeng, and L. He, “Multi-vae: Learning disentangled view- common and view-peculiar visual representations for multi-view clustering,” inProceedings of the IEEE/CVF international conference on computer vision, 2021, pp. 9234–9243

work page 2021

[43] [44]

Generative adversarial nets,

I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial nets,”Advances in neural infor- mation processing systems, vol. 27, 2014

work page 2014

[44] [45]

Generative adversarial networks,

——, “Generative adversarial networks,”Communica- tions of the ACM, vol. 63, no. 11, pp. 139–144, 2020

work page 2020

[45] [46]

Intriguing properties of synthetic im- ages: from generative adversarial networks to diffusion models,

R. Corvi, D. Cozzolino, G. Poggi, K. Nagano, and L. Verdoliva, “Intriguing properties of synthetic im- ages: from generative adversarial networks to diffusion models,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 973– 982

work page 2023

[46] [47]

Dual adversarial autoencoders for clustering,

P . Ge, C.-X. Ren, D.-Q. Dai, J. Feng, and S. Yan, “Dual adversarial autoencoders for clustering,”IEEE trans- actions on neural networks and learning systems, vol. 31, no. 4, pp. 1417–1424, 2019

work page 2019

[47] [48]

Sparsemvc: Probing cross-view sparsity variations for multi-view clustering,

R. Liu, X. Zou, C. Tang, X. Zheng, X. Hu, K. Sun, and X. Liu, “Sparsemvc: Probing cross-view sparsity variations for multi-view clustering,” inThe Thirty- ninth Annual Conference on Neural Information Processing Systems, 2025

work page 2025

[48] [49]

Representation learning in multi-view clustering: A literature review,

M.-S. Chen, J.-Q. Lin, X.-L. Li, B.-Y. Liu, C.-D. Wang, D. Huang, and J.-H. Lai, “Representation learning in multi-view clustering: A literature review,”Data Science and Engineering, vol. 7, no. 3, pp. 225–241, 2022

work page 2022

[49] [50]

An information- maximization approach to blind separation and blind deconvolution,

A. J. Bell and T. J. Sejnowski, “An information- maximization approach to blind separation and blind deconvolution,”Neural computation, vol. 7, no. 6, pp. 1129–1159, 1995

work page 1995

[50] [51]

Learning deep representations by mutual information estimation and maximization

R. D. Hjelm, A. Fedorov, S. Lavoie-Marchildon, K. Gre- wal, P . Bachman, A. Trischler, and Y. Bengio, “Learning deep representations by mutual information estimation and maximization,”arXiv preprint arXiv:1808.06670, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018

[51] [52]

Learn- ing representations by maximizing mutual information across views,

P . Bachman, R. D. Hjelm, and W. Buchwalter, “Learn- ing representations by maximizing mutual information across views,”Advances in neural information processing systems, vol. 32, 2019

work page 2019

[52] [53]

Deep mutual information maximin for cross-modal clustering,

Y. Mao, X. Yan, Q. Guo, and Y. Ye, “Deep mutual information maximin for cross-modal clustering,” in Proceedings of the AAAI Conference on Artificial Intelli- gence, vol. 35, no. 10, 2021, pp. 8893–8901

work page 2021

[53] [54]

Multi-view clustering via triplex information maximization,

C. Zhang, Z. Lou, Q. Zhou, and S. Hu, “Multi-view clustering via triplex information maximization,”IEEE Transactions on Image Processing, 2023

work page 2023

[54] [55]

De- coupled contrastive multi-view clustering with high- order random walks,

Y. Lu, Y. Lin, M. Yang, D. Peng, P . Hu, and X. Peng, “De- coupled contrastive multi-view clustering with high- order random walks,” inProceedings of the AAAI Con- ference on Artificial Intelligence, vol. 38, no. 13, 2024, pp. 14 193–14 201

work page 2024

[55] [56]

Mcoco: Multi-level consistency collaborative multi- view clustering,

Y. Zhou, Q. Zheng, Y. Wang, W. Yan, P . Shi, and J. Zhu, “Mcoco: Multi-level consistency collaborative multi- view clustering,”Expert Systems with Applications, vol. 238, p. 121976, 2024

work page 2024

[56] [57]

beta- vae: Learning basic visual concepts with a constrained variational framework

I. Higgins, L. Matthey, A. Pal, C. P . Burgess, X. Glorot, M. M. Botvinick, S. Mohamed, and A. Lerchner, “beta- vae: Learning basic visual concepts with a constrained variational framework.”ICLR (Poster), vol. 3, 2017

work page 2017

[57] [58]

Understanding disentangling in $\beta$-VAE

C. P . Burgess, I. Higgins, A. Pal, L. Matthey, N. Watters, G. Desjardins, and A. Lerchner, “Understanding dis- entangling inβ-VAE,”arXiv preprint arXiv:1804.03599, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018

[58] [59]

Challenging common assumptions in the unsupervised learning of disen- tangled representations,

F. Locatello, S. Bauer, M. Lucic, G. Raetsch, S. Gelly, B. Sch ¨olkopf, and O. Bachem, “Challenging common assumptions in the unsupervised learning of disen- tangled representations,” ininternational conference on machine learning. PMLR, 2019, pp. 4114–4124

work page 2019

[59] [60]

Infogan: Interpretable rep- IEEE TRANSACTIONS ON PATTERN ANAL YSIS AND MACHINE INTELLIGENCE 15 resentation learning by information maximizing gener- ative adversarial nets,

X. Chen, Y. Duan, R. Houthooft, J. Schulman, I. Sutskever, and P . Abbeel, “Infogan: Interpretable rep- IEEE TRANSACTIONS ON PATTERN ANAL YSIS AND MACHINE INTELLIGENCE 15 resentation learning by information maximizing gener- ative adversarial nets,”Advances in neural information processing systems, vol. 29, 2016

work page 2016

[60] [61]

Causalvae: Disentangled representation learning via neural structural causal models,

M. Yang, F. Liu, Z. Chen, X. Shen, J. Hao, and J. Wang, “Causalvae: Disentangled representation learning via neural structural causal models,” inProceedings of the IEEE/CVF conference on computer vision and pattern recog- nition, 2021, pp. 9593–9602

work page 2021

[61] [62]

High-fidelity synthesis with disentangled representation,

W. Lee, D. Kim, S. Hong, and H. Lee, “High-fidelity synthesis with disentangled representation,” inCom- puter Vision–ECCV 2020: 16th European Conference, Glas- gow, UK, August 23–28, 2020, Proceedings, Part XXVI 16. Springer, 2020, pp. 157–174

work page 2020

[62] [63]

Vmi-vae: Variational mu- tual information maximization framework for vae with discrete and continuous priors,

A. Serdega and D.-S. Kim, “Vmi-vae: Variational mu- tual information maximization framework for vae with discrete and continuous priors,”arXiv preprint arXiv:2005.13953, 2020

work page arXiv 2005

[63] [64]

Debias- ing graph neural networks via learning disentangled causal substructure,

S. Fan, X. Wang, Y. Mo, C. Shi, and J. Tang, “Debias- ing graph neural networks via learning disentangled causal substructure,”Advances in Neural Information Processing Systems, vol. 35, pp. 24 934–24 946, 2022

work page 2022

[64] [65]

Multi-level feature learning for contrastive multi-view clustering,

J. Xu, H. Tang, Y. Ren, L. Peng, X. Zhu, and L. He, “Multi-level feature learning for contrastive multi-view clustering,” inProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2022, pp. 16 051– 16 060

work page 2022

[65] [66]

Reducing the dimensionality of data with neural networks,

G. E. Hinton and R. R. Salakhutdinov, “Reducing the dimensionality of data with neural networks,”science, vol. 313, no. 5786, pp. 504–507, 2006

work page 2006

[66] [67]

Deep spectral clustering using dual autoencoder network,

X. Yang, C. Deng, F. Zheng, J. Yan, and W. Liu, “Deep spectral clustering using dual autoencoder network,” inProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2019, pp. 4066–4075

work page 2019

[67] [68]

Adaptive graph auto- encoder for general data clustering,

X. Li, H. Zhang, and R. Zhang, “Adaptive graph auto- encoder for general data clustering,”IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 44, no. 12, pp. 9725–9732, 2021

work page 2021

[68] [69]

Trustworthy multi-view clustering via alternat- ing generative adversarial representation learning and fusion,

W. Yang, M. Wang, C. Tang, X. Zheng, X. Liu, and K. He, “Trustworthy multi-view clustering via alternat- ing generative adversarial representation learning and fusion,”Information Fusion, vol. 107, p. 102323, 2024

work page 2024

[69] [70]

Zeronas: Differentiable generative adver- sarial networks search for zero-shot learning,

C. Yan, X. Chang, Z. Li, W. Guan, Z. Ge, L. Zhu, and Q. Zheng, “Zeronas: Differentiable generative adver- sarial networks search for zero-shot learning,”IEEE transactions on pattern analysis and machine intelligence, vol. 44, no. 12, pp. 9733–9740, 2021

work page 2021

[70] [71]

Representation Learning with Contrastive Predictive Coding

A. v. d. Oord, Y. Li, and O. Vinyals, “Representa- tion learning with contrastive predictive coding,”arXiv preprint arXiv:1807.03748, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018

[71] [72]

Mutual information-driven multi-view clustering,

L. Zhang, L. Fu, T. Wang, C. Chen, and C. Zhang, “Mutual information-driven multi-view clustering,” in Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 2023, pp. 3268– 3277

work page 2023

[72] [73]

Disentangled multiplex graph representation learn- ing,

Y. Mo, Y. Lei, J. Shen, X. Shi, H. T. Shen, and X. Zhu, “Disentangled multiplex graph representation learn- ing,” inInternational Conference on Machine Learning. PMLR, 2023, pp. 24 983–25 005

work page 2023

[73] [74]

Dual contrastive prediction for incomplete multi-view representation learning,

Y. Lin, Y. Gou, X. Liu, J. Bai, J. Lv, and X. Peng, “Dual contrastive prediction for incomplete multi-view representation learning,”IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 4, pp. 4447–4461, 2022

[74] [75]

H. Tang and Y. Liu, “Deep safe multi-view clustering: Reducing the risk of clustering performance degradation caused by view increase,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 202–211

[75] [76]

M. Yang, Y. Li, P. Hu, J. Bai, J. Lv, and X. Peng, “Robust multi-view clustering with incomplete information,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 1, pp. 1055–1069, 2022

[76] [77]

X. Yang, J. Jiaqi, S. Wang, K. Liang, Y. Liu, Y. Wen, S. Liu, S. Zhou, X. Liu, and E. Zhu, “Dealmvc: Dual contrastive calibration for multi-view clustering,” in Proceedings of the 31st ACM International Conference on Multimedia, 2023, pp. 337–346

[77] [78]

J. Jin, S. Wang, Z. Dong, X. Liu, and E. Zhu, “Deep incomplete multi-view clustering with cross-view partial sample and prototype alignment,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 11 600–11 609

[78] [79]

J. Xu, Y. Ren, H. Tang, Z. Yang, L. Pan, Y. Yang, X. Pu, S. Y. Philip, and L. He, “Self-supervised discriminative feature learning for deep multi-view clustering,” IEEE Transactions on Knowledge and Data Engineering, vol. 35, no. 7, pp. 7470–7482, 2023

[79] [80]

J. Chen, H. Mao, W. L. Woo, and X. Peng, “Deep multi-view clustering by contrasting cluster assignments,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 16 752–16 761

[80] [81]

Y. Zhang, Y. Lin, W. Yan, L. Yao, X. Wan, G. Li, C. Zhang, G. Ke, and J. Xu, “Incomplete multi-view clustering via diffusion contrastive generation,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 39, no. 21, 2025, pp. 22 650–22 658
