
arxiv: 2604.09064 · v1 · submitted 2026-04-10 · 💻 cs.LG

Feature-Label Modal Alignment for Robust Partial Multi-Label Learning

Pith reviewed 2026-05-10 17:26 UTC · model grok-4.3

classification 💻 cs.LG
keywords partial multi-label learning · modal alignment · pseudo-label generation · noise robustness · low-rank decomposition · class prototype learning · feature-label consistency

The pith

Feature-label modal alignment restores consistency to improve robust partial multi-label classification

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

In partial multi-label learning each instance carries a bag of candidate labels that mixes correct and noisy entries, which breaks the expected correspondence between input features and output labels. The paper claims that treating features and labels as two complementary modalities and restoring their alignment can recover pseudo-labels closer to the true distribution. It first applies low-rank orthogonal decomposition to filter noise into cleaner pseudo-labels, then aligns the original features with these pseudo-labels by a global projection into a shared subspace plus local preservation of neighborhood relations, and finally sharpens discriminability with a multi-peak prototype learner that treats the pseudo-labels as soft membership weights. A sympathetic reader would care because partial multi-label data arises in many practical tagging and annotation tasks where exhaustive clean labeling is expensive, so a method that tolerates noise while still producing usable classifiers would widen the range of deployable applications.

Core claim

PML-MA generates pseudo-labels via low-rank orthogonal decomposition to approximate the true label distribution, aligns features and pseudo-labels through both global projection into a common subspace and local preservation of neighborhood structures, and applies multi-peak class prototype learning that uses the pseudo-labels as soft weights to increase class separability, thereby restoring feature-label consistency and delivering higher accuracy under label noise.
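As a rough illustration of the first stage, the sketch below denoises a candidate label matrix with a rank-k truncated SVD. This is a stand-in and not the paper's solver: the function name `low_rank_pseudo_labels`, the rank `k`, the clipping to [0, 1], and the masking to candidate entries are all assumptions made for illustration.

```python
import numpy as np

def low_rank_pseudo_labels(Y, k):
    """Rank-k approximation of a candidate label matrix Y (n x c).

    Assumption: truncated SVD stands in for the paper's low-rank orthogonal
    decomposition; clipping and masking are illustrative choices.
    """
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    R = U[:, :k] * s[:k]            # n x k low-rank factor
    Q = Vt[:k].T                    # c x k factor with orthonormal columns
    Y_hat = np.clip(R @ Q.T, 0.0, 1.0)
    return Y_hat * (Y > 0), Q       # labels outside the candidate set stay at zero

# Toy candidate matrix: 4 instances, 3 labels (1 = label is in the candidate set).
Y = np.array([[1, 1, 0],
              [1, 1, 0],
              [0, 1, 1],
              [1, 0, 1]], dtype=float)
pseudo, Q = low_rank_pseudo_labels(Y, k=2)
print(np.round(pseudo, 2))
```

The returned scores can be read as soft confidences over each instance's candidate set. How well they track the true labels depends on the choice of `k`, which is the sensitivity the referee report raises.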

What carries the argument

Feature-label modal alignment, which combines low-rank orthogonal decomposition for pseudo-label generation with global subspace projection and local neighborhood preservation to enforce consistency between the feature and label modalities.
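A minimal sketch of how the global and local alignment terms could be evaluated, following the form of the objective excerpted from the paper's appendix: a global term ||X P1 - R P2||_F^2 plus a neighborhood-weighted term over pairs (i, j). The binary k-nearest-neighbour graph used for s_ij, the helper names, and the random inputs are assumptions; the page does not say how the similarity weights are built.

```python
import numpy as np

def knn_similarity(X, n_neighbors):
    """Binary, symmetrized k-nearest-neighbour graph (an assumed choice of s_ij)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)                    # exclude self-matches
    idx = np.argsort(d2, axis=1)[:, :n_neighbors]
    S = np.zeros_like(d2)
    S[np.repeat(np.arange(len(X)), n_neighbors), idx.ravel()] = 1.0
    return np.maximum(S, S.T)

def alignment_loss(X, R, P1, P2, S, alpha):
    """||X P1 - R P2||_F^2 + alpha * sum_ij s_ij ||P1^T x_i - P2^T r_j||^2."""
    A, B = X @ P1, R @ P2                           # both n x m, in the shared subspace
    global_term = np.sum((A - B) ** 2)
    cross = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)   # ||a_i - b_j||^2
    return global_term + alpha * np.sum(S * cross)

rng = np.random.default_rng(0)
n, d, c, m = 6, 5, 3, 2
X, R = rng.normal(size=(n, d)), rng.random(size=(n, c))
P1 = rng.normal(size=(d, m))
P2 = np.linalg.qr(rng.normal(size=(c, m)))[0]       # orthonormal columns: P2^T P2 = I_m
S = knn_similarity(X, n_neighbors=2)
loss = alignment_loss(X, R, P1, P2, S, alpha=0.1)
print(loss)
```

Setting `alpha=0` recovers the purely global projection term, which makes it easy to see how much the local term contributes for a given pair of projections.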

If this is right

  • Classification accuracy rises on both real-world and synthetic partial multi-label datasets compared with prior methods.
  • Performance remains higher as the fraction of noisy labels in the candidate sets increases.
  • The generated pseudo-labels more closely match the underlying true label distributions.
  • Multi-label instances gain better discriminability through the prototype refinement that exploits soft membership weights.
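To make the idea of soft membership weights concrete, here is a simplified sketch that computes one prototype per class as a weighted mean of the instances. It is not the paper's multi-peak update; the function name `soft_prototypes` and the toy data are assumptions.

```python
import numpy as np

def soft_prototypes(X, R, eps=1e-12):
    """One prototype per class: mean of instances weighted by soft membership R[i, j]."""
    weights = R / (R.sum(axis=0, keepdims=True) + eps)   # normalize each class column
    return weights.T @ X                                  # c x d

X = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]])
R = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])       # middle instance belongs to both classes
V = soft_prototypes(X, R)
print(V)
```

Because the middle instance carries weight in both columns of `R`, it pulls both prototypes toward itself, which is the multi-label behaviour a hard one-class assignment would miss.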

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same global-local alignment pattern might transfer to other noisy or incomplete label settings such as noisy single-label classification or missing-label multi-label problems.
  • An adaptive procedure for selecting the decomposition rank could reduce sensitivity to this hyper-parameter without changing the core alignment logic.
  • Because local structure is explicitly preserved, the method may combine naturally with graph-based regularizers in future extensions.

Load-bearing premise

Low-rank orthogonal decomposition together with the chosen global and local alignments will recover pseudo-labels close to the true distribution without systematic bias from the selected rank or projection matrices.

What would settle it

On a synthetic dataset with known ground-truth labels and controlled noise rates, measure whether the pseudo-labels produced by the decomposition-plus-alignment steps show lower Hamming loss than those from simple frequency-based or thresholding baselines; if they do not, the recovery claim fails.
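The check described above could be set up along the following lines. Everything in this sketch is an assumption for illustration: the synthetic generator, the noise rate, the rank, the accept-all baseline, and the truncated SVD used in place of the paper's decomposition-plus-alignment steps.

```python
import numpy as np

def hamming_loss(Y_true, Y_pred):
    """Fraction of instance-label pairs on which two binary matrices disagree."""
    return float(np.mean(Y_true != Y_pred))

rng = np.random.default_rng(0)
n, c, k, noise_rate = 200, 8, 2, 0.2

# Synthetic ground truth with low-rank structure (an assumption of this toy setup).
Z = rng.normal(size=(n, k)) @ rng.normal(size=(k, c))
Y_true = (Z > 0.5).astype(int)

# Candidate sets: every true label plus false positives added at noise_rate.
Y_cand = np.maximum(Y_true, (rng.random((n, c)) < noise_rate).astype(int))

# Baseline: accept every candidate label.
loss_baseline = hamming_loss(Y_true, Y_cand)

# Stand-in for the decomposition step: rank-k SVD, threshold, restrict to candidates.
U, s, Vt = np.linalg.svd(Y_cand.astype(float), full_matrices=False)
Y_low = (U[:, :k] * s[:k]) @ Vt[:k]
Y_pseudo = ((Y_low > 0.5) & (Y_cand == 1)).astype(int)
loss_pseudo = hamming_loss(Y_true, Y_pseudo)

print(f"accept-all baseline: {loss_baseline:.3f}, low-rank pseudo-labels: {loss_pseudo:.3f}")
```

Repeating the comparison across several noise rates and ranks, and swapping in frequency-based or thresholding baselines, would show whether any advantage holds beyond one favourable setting.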

Figures

Figures reproduced from arXiv: 2604.09064 by Guanbin Li, Jie Wen, Weijun Lv, Xiaozhao Fang, Yong Xu, Yu Chen, Yue Huang.

Figure 1: An example of partial multi-label learning. The candidate label set …
Figure 2: Overview of the proposed PML-MA framework. The method comprises: (1) Label Low-Rank Orthogonal decomposition ( …
Figure 3: Results of PML-MA against other approaches with the Nemenyi test (CD = 2.1934 at 0.05 significance level).
Figure 4: Results of PML-MA with varying values of trade-off parameters on …
Figure 5: The convergence curves of PML-MA on the synthetic datasets.
Figure 6: Comparative experiment of low rank orthogonal decomposition and …
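The critical difference quoted in the Figure 3 caption follows the standard Nemenyi formula, CD = q_alpha * sqrt(k(k + 1) / (6N)) for k methods compared over N datasets. The page does not state k or N. The values below (9 methods, 30 datasets, and the tabulated q_0.05 = 3.102 for 9 methods) are assumptions that happen to reproduce the reported figure.

```python
import math

def nemenyi_cd(q_alpha, n_methods, n_datasets):
    """Critical difference CD = q_alpha * sqrt(k (k + 1) / (6 N))."""
    return q_alpha * math.sqrt(n_methods * (n_methods + 1) / (6.0 * n_datasets))

# Assumed values, not stated on this page: 9 methods, 30 datasets,
# and q_0.05 = 3.102 from the standard Nemenyi table for 9 methods.
cd = nemenyi_cd(q_alpha=3.102, n_methods=9, n_datasets=30)
print(round(cd, 4))  # → 2.1934
```

Two methods whose average ranks differ by less than this value are not separated by the test at the stated significance level.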
Original abstract

In partial multi-label learning (PML), each instance is associated with a set of candidate labels containing both ground-truth and noisy labels. The presence of noisy labels disrupts the correspondence between features and labels, degrading classification performance. To address this challenge, we propose a novel PML method based on feature-label modal alignment (PML-MA), which treats features and labels as two complementary modalities and restores their consistency through systematic alignment. Specifically, PML-MA first employs low-rank orthogonal decomposition to generate pseudo-labels that approximate the true label distribution by filtering noisy labels. It then aligns features and pseudo-labels through both global projection into a common subspace and local preservation of neighborhood structures. Finally, a multi-peak class prototype learning mechanism leverages the multi-label nature where instances simultaneously belong to multiple categories, using pseudo-labels as soft membership weights to enhance discriminability. By integrating modal alignment with prototype-guided refinement, PML-MA ensures pseudo-labels better reflect the true distribution while maintaining robustness against label noise. Extensive experiments on both real-world and synthetic datasets demonstrate that PML-MA significantly outperforms state-of-the-art methods, achieving superior classification accuracy and noise robustness.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes PML-MA, a partial multi-label learning method that treats features and labels as complementary modalities. It generates pseudo-labels via low-rank orthogonal decomposition to filter noise from candidate labels, performs global projection into a shared subspace plus local neighborhood preservation for modal alignment, and applies multi-peak class prototype learning using pseudo-labels as soft weights to enhance discriminability. Extensive experiments on real-world and synthetic datasets are reported to show that PML-MA outperforms state-of-the-art methods in classification accuracy and noise robustness.

Significance. If the low-rank decomposition step reliably recovers pseudo-label distributions close to the ground truth and the subsequent alignment steps demonstrably improve feature-label consistency, the framework could advance robust PML by explicitly leveraging cross-modal structure and multi-label softness. The combination of global-local alignment with prototype-guided refinement is a coherent design choice that addresses both noise and the multi-label nature of the problem; credit is due for evaluating on both real and synthetic data to test noise robustness.

major comments (2)
  1. [Abstract and §3] Abstract and §3 (Method): The central claim that low-rank orthogonal decomposition 'approximates the true label distribution by filtering noisy labels' is load-bearing, yet no theoretical bound, convergence guarantee, or sensitivity analysis is provided for the choice of rank k. Because the resulting pseudo-labels are directly input to the global-local alignment objective and the multi-peak prototype learning, any systematic bias from rank misspecification or violation of the low-rank assumption on the candidate label matrix would propagate to the final classifier; the manuscript should either derive a recovery guarantee or include a cross-validation procedure for k.
  2. [§4] §4 (Experiments): The reported superiority in accuracy and noise robustness is not accompanied by ablation studies that isolate the contribution of the low-rank decomposition versus the alignment and prototype components, nor by results when the candidate label matrix deviates from low-rank structure (e.g., high-noise regimes or non-low-rank synthetic constructions). Without these controls it is difficult to attribute gains specifically to the proposed modal alignment rather than to hyper-parameter tuning.
minor comments (2)
  1. [§3] The notation for the orthogonal decomposition, projection matrices, and neighborhood graphs should be introduced with explicit equations and variable definitions at the start of the method section to improve readability.
  2. [§4] Figure captions and axis labels in the experimental plots should explicitly state the noise ratio and dataset names for each curve to allow direct comparison with the text claims.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and positive assessment of the PML-MA framework's design and evaluation on real/synthetic data. We address each major comment below with planned revisions to strengthen the manuscript.

  1. Referee: [Abstract and §3] Abstract and §3 (Method): The central claim that low-rank orthogonal decomposition 'approximates the true label distribution by filtering noisy labels' is load-bearing, yet no theoretical bound, convergence guarantee, or sensitivity analysis is provided for the choice of rank k. Because the resulting pseudo-labels are directly input to the global-local alignment objective and the multi-peak prototype learning, any systematic bias from rank misspecification or violation of the low-rank assumption on the candidate label matrix would propagate to the final classifier; the manuscript should either derive a recovery guarantee or include a cross-validation procedure for k.

    Authors: We acknowledge that the manuscript lacks a theoretical recovery guarantee or convergence analysis for the low-rank orthogonal decomposition. Deriving a general bound is non-trivial given the instance-specific and unstructured nature of label noise in PML, which would require strong assumptions unlikely to hold across real-world datasets. However, we will revise §3 to include a sensitivity analysis evaluating performance for a range of rank k values on all datasets, and we will add a cross-validation procedure for selecting k (based on label matrix reconstruction error or validation accuracy). This directly addresses potential bias propagation from rank misspecification while clarifying the empirical robustness of the pseudo-label generation step.
    revision: partial

  2. Referee: [§4] §4 (Experiments): The reported superiority in accuracy and noise robustness is not accompanied by ablation studies that isolate the contribution of the low-rank decomposition versus the alignment and prototype components, nor by results when the candidate label matrix deviates from low-rank structure (e.g., high-noise regimes or non-low-rank synthetic constructions). Without these controls it is difficult to attribute gains specifically to the proposed modal alignment rather than to hyper-parameter tuning.

    Authors: We agree that isolating component contributions and testing under low-rank violations would strengthen the experimental claims. In the revised §4, we will add ablation studies removing the low-rank decomposition (using raw candidates), disabling global/local alignment, and omitting multi-peak prototypes, reporting accuracy drops on all datasets. We will also construct additional synthetic datasets with deliberately non-low-rank candidate matrices (e.g., via high-rank structured noise or uncorrelated labels) and high-noise regimes, evaluating PML-MA performance to demonstrate the necessity of the low-rank step and modal alignment for robustness.
    revision: yes
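The rank-selection procedure promised in the first response could take many forms. One hedged possibility is sketched below: hide a random subset of entries of the candidate label matrix, fit a rank-k approximation to the rest, and pick the k with the lowest error on the hidden entries. The helper names, the mean-fill imputation, and the use of truncated SVD are assumptions, not the authors' procedure.

```python
import numpy as np

def heldout_error(Y, k, mask):
    """Squared error on held-out entries after a rank-k fit to the remaining ones."""
    Y_fill = np.where(mask, Y[~mask].mean(), Y)     # hide held-out entries behind the observed mean
    U, s, Vt = np.linalg.svd(Y_fill, full_matrices=False)
    Y_hat = (U[:, :k] * s[:k]) @ Vt[:k]
    return float(np.mean((Y_hat[mask] - Y[mask]) ** 2))

def select_rank(Y, ranks, holdout=0.2, seed=0):
    """Return the rank with the lowest held-out error, plus the error for every rank."""
    mask = np.random.default_rng(seed).random(Y.shape) < holdout
    errors = {k: heldout_error(Y, k, mask) for k in ranks}
    return min(errors, key=errors.get), errors

# Toy candidate matrix built from a rank-3 score matrix (illustrative only).
rng = np.random.default_rng(1)
Y = (rng.normal(size=(100, 3)) @ rng.normal(size=(3, 10)) > 0).astype(float)
best_k, errors = select_rank(Y, ranks=range(1, 8))
print(best_k, {k: round(e, 3) for k, e in errors.items()})
```

Held-out error is used rather than full reconstruction error because the latter can only decrease as k grows and so cannot choose a rank on its own.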

Circularity Check

0 steps flagged

No circularity: pipeline described without equations or self-referential reductions


The abstract and summary present a high-level pipeline: low-rank orthogonal decomposition for pseudo-label generation, followed by global-local feature-label alignment and multi-peak prototype learning. No equations, fitted parameters renamed as predictions, self-citations, or ansatzes are quoted that would reduce any claim to its inputs by construction. The method is introduced as a novel integration of existing ideas (low-rank decomposition, modal alignment) without load-bearing self-referential steps. This matches the default case of a self-contained description with no detectable circularity.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The approach rests on the domain assumption that features and labels share an underlying structure that can be recovered by alignment, plus standard low-rank and neighborhood-preservation assumptions common in ML. No new physical entities are postulated.

free parameters (1)
  • rank of orthogonal decomposition
    The rank parameter must be chosen or tuned to filter noise; its value directly affects the pseudo-label quality.
axioms (1)
  • domain assumption: Features and labels can be treated as complementary modalities whose consistency can be restored by projection and neighborhood preservation.
    Invoked when the paper states that alignment restores correspondence disrupted by noisy labels.

pith-pipeline@v0.9.0 · 5508 in / 1143 out tokens · 41825 ms · 2026-05-10T17:26:42.762468+00:00 · methodology


Appendix A: Optimization

A. Overall Objective Function

$$
\min_{P_1, P_2, Q, R, V, D} \; \|XP_1 - RP_2\|_F^2 + \|Y - RQ^\top\|_F^2 + \lambda \|R\|_* + \alpha \sum_{i=1}^{n} \sum_{j=1}^{n} s_{ij} \|P_1^\top x_i - P_2^\top r_j\|_2^2 + \beta \sum_{i=1}^{n} \Big\| x_i d_{ii} - \sum_{j=1}^{c} r \ldots
$$

Update $P_1$ and $P_2$: Removing the terms that are irrelevant to $P_1$ and $P_2$, the suboptimization problem derived from Eq. (15) is simplified as:

$$
\min_{P_1, P_2} \; \|XP_1 - RP_2\|_F^2 + \alpha \sum_{i=1}^{n} \sum_{j=1}^{n} s_{ij} \|P_1^\top x_i - P_2^\top r_j\|_2^2 + \gamma \|P_2 P_1^\top\|_F^2, \quad \text{s.t. } P_2^\top P_2 = I_m. \tag{16}
$$

The local alignment term in Eq. (16) can be converted to trace form: $\sum_{i=1}^{n} \sum_{j=1}^{n} s_{ij} \|P_1^\top x_i - P_2^\top r_j\|_2^2$ ...

Update $Q$: Removing the items that are irrelevant to $Q$, and fix $R$. Then, the suboptimization problem derived in Eq. (15) is simplified as:

$$
\min_{Q} \; \|Y - RQ^\top\|_F^2, \quad \text{s.t. } Q^\top Q = I_c. \tag{24}
$$

Since $\|Y - RQ^\top\|_F^2 = \mathrm{Tr}(Y^\top Y + RQ^\top QR^\top - 2QR^\top Y)$ and $Q^\top Q = I_c$, for $Y^\top Y$ and $RQ^\top QR^\top$, these values are constant. Alternatively, the optimization problems in Eq. (24) can be stated as maximizing th...

Update $V$: Removing the items that are irrelevant to $V$, and fix $R$ and $D$. Then, the suboptimization problem derived in Eq. (15) is simplified as:

$$
\min_{V} \; \sum_{i=1}^{n} \Big\| x_i d_{ii} - \sum_{j=1}^{c} r_{ij} v_j \Big\|_2^2 = \|X^\top D - VR^\top\|_F^2. \tag{27}
$$

After differentiating Eq. (27), set the derivative to 0, the updated formula for $V$ can be obtained as follows:

$$
V^{t+1} = X^\top D R (R^\top R)^{-1}, \tag{28}
$$

or $v_j^{t+1} = \sum_{i=1}^{n} r_{ij}$ ...

Update $R$ and $D$: Removing the terms that are irrelevant to $R$, and fixing $P_1$, $P_2$, $Q$ and $V$. First, we update the scalar weight matrix $D$ using the result of the current iteration round $R^t$:

$$
d_{ii}^{t+1} = \sum_{j=1}^{c} r_{ij}^{t}. \tag{30}
$$

Then, the suboptimization problem for $R$ derived from Eq. (15) is:

$$
\min_{R} \; \|XP_1 - RP_2\|_F^2 + \|Y - RQ^\top\|_F^2 + \lambda \|R\|_* + \alpha \sum_{i=1}^{n} \sum_{j=1}^{n} s_{ij} \|P_1^\top x_i - P_2^\top r_j\|_2^2 + \beta \|X \ldots
$$