Multi-label feature selection based on binary hashing learning and dynamic graph constraints
Pith reviewed 2026-05-23 00:06 UTC · model grok-4.3
The pith
Binary hashing codes serve as pseudo-labels to build more reliable dynamic graphs in multi-label feature selection.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper presents BHDG as the first method to integrate binary hashing into multi-label learning. It uses low-dimensional binary hashing codes as pseudo-labels to reduce noise and improve representation robustness, and it builds a dynamically constrained sample projection space from the graph structure of those binary pseudo-labels. It further incorporates label graph constraints and inner product minimization within the sample space, and adds an l2,1-norm regularization term to facilitate feature selection. The full objective is optimized with the augmented Lagrangian multiplier (ALM) method.
What carries the argument
Low-dimensional binary hashing codes used as pseudo-labels to construct dynamic graph constraints on the sample projection space.
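To make the mechanism concrete, here is a minimal sketch of how a sample graph could be derived from binary codes. It is not the paper's construction: the Hamming-based similarity, the unnormalized Laplacian, and the names `hamming`, `similarity_graph`, and `laplacian` are all assumptions chosen for illustration, and the codes are taken as given rather than learned.

```python
def hamming(a, b):
    """Number of positions where two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))


def similarity_graph(codes):
    """Dense similarity matrix with S[i][j] = 1 - hamming/k and a zero diagonal.

    Assumption: similarity is the fraction of matching bits; the paper may
    weight or sparsify the graph differently.
    """
    k = len(codes[0])
    n = len(codes)
    return [
        [0.0 if i == j else 1.0 - hamming(codes[i], codes[j]) / k for j in range(n)]
        for i in range(n)
    ]


def laplacian(S):
    """Unnormalized graph Laplacian L = D - S, where D holds the row sums of S."""
    n = len(S)
    return [
        [(sum(S[i]) if i == j else 0.0) - S[i][j] for j in range(n)]
        for i in range(n)
    ]


# Three samples with 4-bit codes in {-1, +1} (made-up values).
codes = [[1, 1, -1, 1], [1, 1, -1, -1], [-1, -1, 1, -1]]
S = similarity_graph(codes)
L = laplacian(S)
```

Because the codes are binary, similarities take only a few discrete values (here multiples of 1/4), which is one plausible reading of why such a graph might be less sensitive to small perturbations than one built from continuous pseudo-labels.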
If this is right
- The binary pseudo-labels yield graph structures that support more accurate feature selection than those built from continuous values.
- Adding label graph constraints and inner product minimization further improves pseudo-label quality beyond the hashing step alone.
- The l2,1-norm term combined with the binary codes enables effective selection of discriminative features across multiple labels.
- The ALM optimization successfully handles the binary variables without requiring relaxation to continuous values.
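The role of the l2,1-norm term can be illustrated with a short sketch. The l2,1-norm of a weight matrix W (one row per feature) is the sum of the Euclidean norms of its rows, so penalizing it pushes entire rows toward zero; features are then commonly ranked by row norm. The helper names and the toy matrix below are assumptions for illustration, not taken from the paper.

```python
import math


def row_norms(W):
    """Euclidean norm of each row of W (one row per feature)."""
    return [math.sqrt(sum(w * w for w in row)) for row in W]


def l21_norm(W):
    """l2,1-norm: the sum of the row-wise Euclidean norms."""
    return sum(row_norms(W))


def top_features(W, m):
    """Indices of the m features with the largest row norms, best first."""
    norms = row_norms(W)
    return sorted(range(len(W)), key=lambda i: norms[i], reverse=True)[:m]


# Toy weight matrix: 3 features x 2 output dimensions (made-up values).
W = [[3.0, 4.0], [0.0, 0.0], [1.0, 0.0]]
norm_value = l21_norm(W)      # 5.0 + 0.0 + 1.0
selected = top_features(W, 2)
```

In this toy case the second feature has an all-zero row, so it contributes nothing to the norm and is ranked last.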
Where Pith is reading between the lines
- The same binary hashing step could be tested as a drop-in replacement for continuous pseudo-labels inside other graph-based multi-label algorithms.
- If binary codes consistently reduce label noise, they may also improve performance in related tasks such as multi-label classification without explicit feature selection.
- The approach leaves open whether the performance gain scales with increasing label cardinality or dataset size.
Load-bearing premise
Low-dimensional binary hashing codes reduce noise from irrelevant labels and produce more reliable dynamic graph structures than continuous pseudo-labels.
What would settle it
An ablation that keeps every other component of BHDG fixed but swaps the binary hashing codes for continuous pseudo-labels, evaluated on the same ten datasets, would settle it. If performance does not drop, the central claim is falsified.
Original abstract
Multi-label learning poses significant challenges in extracting reliable supervisory signals from the label space. Existing approaches often employ continuous pseudo-labels to replace binary labels, improving supervisory information representation. However, these methods can introduce noise from irrelevant labels and lead to unreliable graph structures. To overcome these limitations, this study introduces a novel multi-label feature selection method called Binary Hashing and Dynamic Graph Constraint (BHDG), the first method to integrate binary hashing into multi-label learning. BHDG utilizes low-dimensional binary hashing codes as pseudo-labels to reduce noise and improve representation robustness. A dynamically constrained sample projection space is constructed based on the graph structure of these binary pseudo-labels, enhancing the reliability of the dynamic graph. To further enhance pseudo-label quality, BHDG incorporates label graph constraints and inner product minimization within the sample space. Additionally, an $l_{2,1}$-norm regularization term is added to the objective function to facilitate the feature selection process. The augmented Lagrangian multiplier (ALM) method is employed to optimize binary variables effectively. Comprehensive experiments on 10 benchmark datasets demonstrate that BHDG outperforms ten state-of-the-art methods across six evaluation metrics. BHDG achieves the highest overall performance ranking, surpassing the next-best method by an average of at least 2.7 ranks per metric, underscoring its effectiveness and robustness in multi-label feature selection.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces BHDG, a multi-label feature selection method that generates low-dimensional binary hashing codes as pseudo-labels to reduce noise from irrelevant labels, constructs a dynamically constrained sample projection space from the graph of these binary codes, incorporates label graph constraints and inner product minimization, adds an l_{2,1}-norm regularization term for feature selection, and solves the resulting objective via the augmented Lagrangian multiplier (ALM) method. Comprehensive experiments on 10 benchmark datasets are reported to show that BHDG outperforms ten state-of-the-art methods across six evaluation metrics, achieving the highest overall performance ranking.
Significance. If the reported empirical ranking holds under standard statistical validation, the work would demonstrate a practical benefit of binary pseudo-labels over continuous ones for constructing reliable dynamic graphs in multi-label feature selection. The coherent integration of hashing, graph constraints, and ALM optimization provides a clear algorithmic contribution that could be adopted in related multi-label tasks.
minor comments (2)
- [Abstract, §4] Abstract and §4 (Experiments): the claim of outperformance would be strengthened by explicit reporting of the experimental protocol, including how the 10 datasets were split, the range of hashing code lengths tested, the procedure for selecting the trade-off regularization parameters, and whether paired statistical tests (e.g., Wilcoxon) were applied to the six metrics.
- [§3] §3 (Method): the description of the dynamic graph construction from binary codes should clarify whether the graph is recomputed at each ALM iteration or only once, as this affects both computational cost and the interpretation of the “dynamic” constraint.
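The statistical validation requested in the first comment can be sketched as follows. The comment names the Wilcoxon test as one option; this sketch instead shows the related average-rank comparison across datasets together with the Friedman chi-square statistic, since the abstract reports results as ranks. The scores are made-up illustrative numbers, not results from the paper, and the function names are hypothetical.

```python
def ranks_desc(values):
    """Rank values so the largest gets rank 1; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1  # mean of the 1-based positions pos+1..end+1
        for t in range(pos, end + 1):
            ranks[order[t]] = avg
        pos = end + 1
    return ranks


def average_ranks(scores):
    """scores[d][j] is the score of method j on dataset d (higher is better)."""
    n, k = len(scores), len(scores[0])
    per_dataset = [ranks_desc(row) for row in scores]
    return [sum(r[j] for r in per_dataset) / n for j in range(k)]


def friedman_chi2(avg_ranks, n):
    """Friedman statistic from the average ranks of k methods over n datasets."""
    k = len(avg_ranks)
    return 12 * n / (k * (k + 1)) * (sum(r * r for r in avg_ranks) - k * (k + 1) ** 2 / 4)


# Made-up scores: 4 datasets (rows) x 3 methods (columns).
scores = [
    [0.90, 0.85, 0.80],
    [0.70, 0.75, 0.60],
    [0.88, 0.80, 0.80],
    [0.95, 0.90, 0.85],
]
avg = average_ranks(scores)
chi2 = friedman_chi2(avg, n=len(scores))
```

A lower average rank is better here. The statistic would then be compared against a chi-square distribution with k - 1 degrees of freedom, with a post-hoc test to locate pairwise differences; for a metric where lower is better, the scores should be negated before ranking.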
Simulated Author's Rebuttal
We thank the referee for the positive summary, significance assessment, and recommendation of minor revision. No specific major comments were provided in the report.
Circularity Check
No significant circularity identified
The paper introduces an algorithmic method BHDG whose objective combines binary hashing pseudo-labels, dynamic graph construction, label-graph constraints, inner-product minimization, and l2,1 regularization, solved by ALM. Its central claim is empirical ranking on 10 external benchmark datasets across 6 metrics. No derivation chain is supplied that reduces a claimed prediction or uniqueness result to a fitted parameter or self-citation by construction; the reported superiority is therefore settled by the external tables rather than by internal redefinition of inputs.
Axiom & Free-Parameter Ledger
free parameters (2)
- trade-off parameters for regularization terms
- hashing code length
axioms (2)
- domain assumption: Binary hashing codes provide more robust pseudo-labels than continuous values by reducing noise from irrelevant labels.
- domain assumption: The graph structure derived from binary pseudo-labels leads to a more reliable dynamic constraint on the sample projection space.