Worse than Random: The Importance of a Baseline for Unsupervised Feature Selection
Pith reviewed 2026-05-25 05:50 UTC · model grok-4.3
The pith
Many state-of-the-art unsupervised feature selection methods perform worse than random feature selection.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We propose using random feature selection as a baseline for evaluating unsupervised feature selection methods. We empirically show that many of the state-of-the-art methods in unsupervised feature selection are outperformed by random feature selection in both performance and efficiency. Accordingly, we emphasize the strict requirement of considering random feature selection as a baseline in the development process of novel unsupervised feature selection methods to ensure a consistent improvement over random feature selection.
What carries the argument
Random feature selection employed as an evaluation baseline to measure whether unsupervised feature selection methods add value beyond chance.
Load-bearing premise
The chosen datasets, evaluation metrics, and implementation of random selection form a fair and representative test of whether a method adds value beyond chance.
What would settle it
A study showing that multiple state-of-the-art methods consistently outperform random feature selection across a broader collection of datasets and metrics would falsify the central claim.
Original abstract
Many novel unsupervised feature selection methods are proposed each year, yet their empirical evaluation is limited to supervised and unsupervised evaluation metrics computed on selected datasets, along with comparisons to existing methods. However, in the absence of an established evaluation baseline, it is difficult to determine the value added to the existing literature by each of these methods, and how effective their underlying approaches are. We propose using random feature selection as a baseline for evaluating unsupervised feature selection methods. We empirically show that many of the state-of-the-art methods in unsupervised feature selection are outperformed by random feature selection in both performance and efficiency. Accordingly, we emphasize the strict requirement of considering random feature selection as a baseline in the development process of novel unsupervised feature selection methods to ensure a consistent improvement over random feature selection.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes using random feature selection as a baseline for evaluating unsupervised feature selection methods. It presents an empirical comparison claiming that many state-of-the-art unsupervised feature selection methods are outperformed by random selection in both performance metrics and efficiency, and argues that novel methods must demonstrate consistent improvement over random to be considered valuable additions to the literature.
Significance. If the empirical results hold under properly controlled conditions, the work would be significant for establishing a minimal sanity-check baseline in unsupervised feature selection research, where many methods currently lack evidence of adding value beyond chance. This could encourage more rigorous evaluation practices and reduce publication of methods whose performance is indistinguishable from or inferior to random selection. The inclusion of efficiency comparisons alongside performance is a positive aspect of the study design.
major comments (2)
- [Abstract] Abstract: the central claim that many SOTA methods are outperformed by random selection is only weakly supported, because the abstract (and, by extension, the reported experiments) gives no details on dataset count, statistical testing, the exact random implementation, or the handling of ties. This leaves untested the paper's weakest assumption: that the chosen datasets, metrics, and random baseline form a fair test.
- [Experiments] Experimental design: to validly claim that random outperforms the methods, the random baseline must select precisely the same number of features k as each compared method and average performance over multiple independent draws rather than a single trial or fixed global k. Without this, any apparent outperformance could arise from mismatched cardinality or sampling variance rather than from the methods adding no value.
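The control described in the comment above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the function name `random_baseline`, the default of 30 draws, and the `toy_score` metric are all hypothetical stand-ins. In practice `score_fn` would compute a real downstream metric on the data restricted to the selected columns.

```python
import random
import statistics
from typing import Callable, Sequence, Tuple


def random_baseline(
    n_features: int,
    k: int,
    score_fn: Callable[[Sequence[int]], float],
    n_draws: int = 30,
    seed: int = 0,
) -> Tuple[float, float]:
    """Mean and standard deviation of score_fn over random k-feature subsets."""
    if not 0 < k <= n_features:
        raise ValueError("k must be between 1 and n_features")
    if n_draws < 2:
        raise ValueError("need at least two draws to estimate a spread")
    rng = random.Random(seed)
    scores = []
    for _ in range(n_draws):
        # Uniform sampling without replacement: every k-subset is equally likely.
        subset = sorted(rng.sample(range(n_features), k))
        scores.append(score_fn(subset))
    return statistics.mean(scores), statistics.stdev(scores)


# Hypothetical toy metric: the fraction of selected features that fall among
# the first 10 columns, which play the role of the "informative" features.
def toy_score(subset: Sequence[int]) -> float:
    return sum(1 for j in subset if j < 10) / len(subset)


method_subset = [0, 1, 2, 3, 50]  # subset returned by some method under test
# The baseline uses the same k as the method it is compared against.
mean, std = random_baseline(100, len(method_subset), toy_score)
print(f"method: {toy_score(method_subset):.2f}  random: {mean:.2f} +/- {std:.2f}")
```

Passing `len(method_subset)` as `k` keeps the cardinality matched per method, and reporting the spread alongside the mean makes it visible when a single lucky or unlucky draw would have changed the conclusion.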
minor comments (1)
- [Abstract] The abstract uses 'strict requirement'; this could be rephrased as a strong recommendation to avoid implying an absolute mandate.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. The comments highlight important aspects of experimental rigor that we will address in the revision.
Point-by-point responses
- Referee: [Abstract] Abstract: the central claim that many SOTA methods are outperformed by random selection is only weakly supported, because the abstract (and, by extension, the reported experiments) gives no details on dataset count, statistical testing, the exact random implementation, or the handling of ties. This leaves untested the paper's weakest assumption: that the chosen datasets, metrics, and random baseline form a fair test.
  Authors: The abstract is space-constrained by design, but the manuscript body reports the datasets, metrics, and random selection procedure. We agree that explicit statistical testing, precise random implementation details, and tie handling are needed for full transparency. In revision we will add these elements to the experiments section (including significance tests and a clarification that random selection is uniform sampling without replacement) and update the abstract to reference the dataset count and the averaged random baseline.
  Revision: yes
- Referee: [Experiments] Experimental design: to validly claim that random outperforms the methods, the random baseline must select precisely the same number of features k as each compared method and average performance over multiple independent draws rather than a single trial or fixed global k. Without this, any apparent outperformance could arise from mismatched cardinality or sampling variance rather than from the methods adding no value.
  Authors: We agree this is a necessary control. The current experiments already match k exactly per method and dataset; however, to reduce the effect of sampling variance we will revise the protocol to average random performance over multiple independent draws (reporting means and standard deviations) rather than single trials. This change will be implemented and documented in the revised manuscript.
  Revision: yes
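The responses above promise significance tests and explicit tie handling but do not say which test will be used. One simple option, shown here purely as an assumption, is an empirical p-value: the share of random draws that score at least as well as the method. The function name `empirical_p_value` and the `toy_score` metric are hypothetical, and the sketch assumes that higher scores are better.

```python
import random
from typing import Sequence


def empirical_p_value(method_score: float, random_scores: Sequence[float]) -> float:
    """Share of random draws scoring at least as well as the method.

    Ties count against the method (>=), and the +1 terms keep the estimate
    from being exactly zero with a finite number of draws.
    """
    if not random_scores:
        raise ValueError("need at least one random draw")
    at_least_as_good = sum(1 for s in random_scores if s >= method_score)
    return (at_least_as_good + 1) / (len(random_scores) + 1)


# Hypothetical toy metric: fraction of selected features among the first 10.
def toy_score(subset: Sequence[int]) -> float:
    return sum(1 for j in subset if j < 10) / len(subset)


rng = random.Random(0)
k = 5  # same subset size as the method under test
draws = [toy_score(rng.sample(range(100), k)) for _ in range(999)]
p = empirical_p_value(toy_score([0, 1, 2, 3, 50]), draws)
print(f"empirical p-value: {p:.3f}")
```

A small value suggests the method's subset is unlikely to be matched by chance on this metric; a value near 1 means random subsets routinely do as well or better, which is the situation the paper reports for many methods.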
Circularity Check
Empirical comparison study with no derivation chain or fitted inputs
full rationale
The paper is an empirical evaluation that proposes random feature selection as a baseline and reports that many existing unsupervised FS methods underperform it on chosen datasets and metrics. No mathematical derivation, equations, fitted parameters, ansatz, or uniqueness theorems are present in the abstract or described structure. No self-citations are invoked as load-bearing support for any claim. The central result rests on direct experimental comparisons rather than any reduction of outputs to inputs by construction. This matches the default expectation for non-circular empirical work.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: standard unsupervised feature selection evaluation metrics accurately reflect method quality.