Worse than Random: The Importance of a Baseline for Unsupervised Feature Selection

Arthur Zimek; Michael E. Houle; Muhammad Rajabinasab; Oussama Chelly

arxiv: 2605.22973 · v1 · pith:PGBJCS54new · submitted 2026-05-21 · 💻 cs.LG · cs.AI

Muhammad Rajabinasab, Michael E. Houle, Oussama Chelly, Arthur Zimek

Pith reviewed 2026-05-25 05:50 UTC · model grok-4.3

classification 💻 cs.LG cs.AI

keywords: unsupervised feature selection · random baseline · feature selection evaluation · performance comparison · efficiency analysis · machine learning methods

The pith

Many state-of-the-art unsupervised feature selection methods perform worse than random feature selection.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes random feature selection as a baseline for evaluating unsupervised feature selection methods. It demonstrates through experiments that many current state-of-the-art approaches are outperformed by random selection both in the quality of the features chosen and in the time required to choose them. Without this baseline, it remains unclear whether new methods contribute any improvement over chance. The authors therefore argue that every new method should be required to show consistent gains over random selection during development.

Core claim

We propose using random feature selection as a baseline for evaluating the unsupervised feature selection methods. We empirically show that many of the state-of-the-art methods in unsupervised feature selection are outperformed by random feature selection in both performance and efficiency. Accordingly, we emphasize on the strict requirement of considering random feature selection as a baseline in the development process of novel unsupervised feature selection methods to ensure a consistent improvement over random feature selection.

What carries the argument

Random feature selection employed as an evaluation baseline to measure whether unsupervised feature selection methods add value beyond chance.

Load-bearing premise

The chosen datasets, evaluation metrics, and implementation of random selection form a fair and representative test of whether a method adds value beyond chance.

What would settle it

A study showing that multiple state-of-the-art methods consistently outperform random feature selection across a broader collection of datasets and metrics would falsify the central claim.

Figures

Figures reproduced from arXiv: 2605.22973 by Arthur Zimek, Michael E. Houle, Muhammad Rajabinasab, Oussama Chelly.

**Figure 3.** By centering the results on the random base

**Figure 2.** Comparison of the feature selection performance of unsupervised feature selection methods with the random baseline on the Isolet dataset

**Figure 3.** Z-score performance relative to the Random baseline on the Isolet dataset over the extreme dimensionality reduction experiment (0.5%

**Figure 4.** Critical difference diagram over the full range of features for different metrics based on the average performance measured by the FSDEM score. Panels: AUC and CLSACC. Values shown on the rank axis (1 to 7), AUC: MCFS 5.2609, Correlation 5.2174, Laplacian 4.2609, Variance 3.6957, VCSDFS 3.4783, Random 3.1739, SCFS 2.9130. CLSACC: MCFS 5.3913, Correlation 4.7826, VCSDFS 4.0435, Variance 3.9130, Random 3.6087, SCFS 3.6087, Laplacian 2.6522.

**Figure 5.** Critical difference diagram over the first 10% of features for different metrics based on the average performance measured by the FSDEM score.
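
Critical difference diagrams such as Figures 4 and 5 are typically built from each method's average rank across datasets. The following is a minimal sketch of that ranking step only, under stated assumptions: the scores are made-up placeholders, the method names `A` and `B` are hypothetical, higher scores are assumed better, rank 1 is assumed best, and tied scores share the average of the ranks they occupy. The captions do not state the rank orientation the paper uses, and the statistical test that determines the critical difference is omitted here.

```python
def average_ranks(scores_by_dataset):
    """Average rank of each method across datasets.

    `scores_by_dataset` is a list of dicts mapping method name -> score,
    one dict per dataset. Assumes higher is better and rank 1 is best.
    Tied scores share the average of the ranks they occupy.
    """
    methods = list(scores_by_dataset[0])
    totals = {m: 0.0 for m in methods}
    for scores in scores_by_dataset:
        ordered = sorted(methods, key=lambda m: scores[m], reverse=True)
        i = 0
        while i < len(ordered):
            j = i
            # Extend j over the run of methods tied with ordered[i].
            while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
                j += 1
            shared = (i + 1 + j + 1) / 2  # mean of 1-based positions i+1..j+1
            for m in ordered[i:j + 1]:
                totals[m] += shared
            i = j + 1
    return {m: totals[m] / len(scores_by_dataset) for m in methods}


# Placeholder scores for two hypothetical methods and a random baseline.
toy = [
    {"A": 0.90, "B": 0.80, "Random": 0.85},
    {"A": 0.70, "B": 0.75, "Random": 0.75},
    {"A": 0.60, "B": 0.50, "Random": 0.65},
]
ranks = average_ranks(toy)
print(ranks)
```

Including the random baseline as one more "method" in the ranking is what lets a diagram show whether a published method sits above or below chance.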

Original abstract

Many novel unsupervised feature selection methods are proposed each year, yet their empirical evaluation is limited to supervised and unsupervised evaluation metrics computed on selected datasets, along with comparisons to existing methods. However, in the absence of an established evaluation baseline, it is difficult to determine the value added to the existing literature by each of these methods, and how effective their underlying approaches are. We propose using random feature selection as a baseline for evaluating the unsupervised feature selection methods. We empirically show that many of the state-of-the-art methods in unsupervised feature selection are outperformed by random feature selection in both performance and efficiency. Accordingly, we emphasize on the strict requirement of considering random feature selection as a baseline in the development process of novel unsupervised feature selection methods to ensure a consistent improvement over random feature selection.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

Random baseline beats many unsupervised FS methods in the reported tests, but the value hinges on whether random draws matched exact k and were averaged.

The core point is that random feature selection often outperforms published unsupervised methods on the metrics they care about, and the authors want this to become a required check. That is the new angle here: supervised work has used random baselines for years, but this applies the same logic systematically to the unsupervised case and reports that many recent methods fall short on both quality and speed. The abstract makes the case directly without overclaiming, and the call for consistent improvement over random is reasonable on its face. Credit to the authors for surfacing an evaluation gap that has been easy to ignore. The experiments appear to be a straightforward empirical comparison with no fitted parameters or circular derivations, which keeps the burden low.

The main soft spot is the implementation of the random baseline itself. If the random runs did not fix the exact number of features k per method and average across repeated draws, then any apparent win for random could come from mismatched cardinality or single-trial variance rather than from the methods truly adding nothing. The stress-test note flags exactly this, and the abstract gives no numbers on dataset count, statistical tests, or tie handling, so the central claim is only weakly supported by the text we have. The full manuscript may fix this, but it is not visible here.

This is the kind of paper that belongs in a reading group for people who build or review unsupervised feature selection work. It is not a breakthrough result, but it is a useful corrective that could raise the bar if the controls are solid. A serious editor should send it to referees rather than desk-reject, mainly to get the baseline details checked and to see whether the underperformance holds under tighter conditions. I would not cite it myself unless the experiments survive that scrutiny.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes using random feature selection as a baseline for evaluating unsupervised feature selection methods. It presents an empirical comparison claiming that many state-of-the-art unsupervised feature selection methods are outperformed by random selection in both performance metrics and efficiency, and argues that novel methods must demonstrate consistent improvement over random to be considered valuable additions to the literature.

Significance. If the empirical results hold under properly controlled conditions, the work would be significant for establishing a minimal sanity-check baseline in unsupervised feature selection research, where many methods currently lack evidence of adding value beyond chance. This could encourage more rigorous evaluation practices and reduce publication of methods whose performance is indistinguishable from or inferior to random selection. The inclusion of efficiency comparisons alongside performance is a positive aspect of the study design.

major comments (2)

[Abstract] Abstract: the central claim that many SOTA methods are outperformed by random selection is only weakly supported because the abstract (and by extension the reported experiments) provides no details on dataset count, statistical testing, exact random implementation, or handling of ties. This directly undermines the reader's weakest assumption that the chosen datasets, metrics, and random baseline form a fair test.
[Experiments] Experimental design: to validly claim that random outperforms the methods, the random baseline must select precisely the same number of features k as each compared method and average performance over multiple independent draws rather than a single trial or fixed global k. Without this, any apparent outperformance could arise from mismatched cardinality or sampling variance rather than from the methods adding no value.
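
The control described in the second major comment can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `random_baseline`, `evaluate`, and `toy_score` are hypothetical names, and the default of 30 draws is an arbitrary choice. The sketch draws exactly k features uniformly without replacement, repeats the draw several times, and reports the mean and standard deviation of the resulting scores.

```python
import random
import statistics


def random_baseline(n_features, k, evaluate, n_draws=30, seed=0):
    """Score `n_draws` uniform random subsets of exactly k features.

    `evaluate` is a hypothetical callable that maps a list of feature
    indices to a quality score (e.g. clustering accuracy on those columns).
    Returns (mean, sample standard deviation) over the draws.
    """
    if not 0 < k <= n_features:
        raise ValueError("k must satisfy 0 < k <= n_features")
    if n_draws < 2:
        raise ValueError("need at least two draws to estimate a deviation")
    rng = random.Random(seed)
    scores = []
    for _ in range(n_draws):
        # Uniform sampling without replacement: every k-subset is equally likely.
        subset = rng.sample(range(n_features), k)
        scores.append(evaluate(subset))
    return statistics.mean(scores), statistics.stdev(scores)


# Toy stand-in for a real metric: fraction of selected features that fall
# among the first 10 ("informative") columns. Purely illustrative.
def toy_score(subset):
    return sum(1 for j in subset if j < 10) / len(subset)


mean_score, std_score = random_baseline(n_features=100, k=20, evaluate=toy_score)
print(f"random baseline: {mean_score:.3f} +/- {std_score:.3f}")
```

Calling `random_baseline` once per (dataset, k) pair, with k set to the number of features the compared method selected, keeps the cardinality matched; the fixed seed only makes the sketch reproducible.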

minor comments (1)

[Abstract] The abstract uses 'strict requirement'; this could be rephrased as a strong recommendation to avoid implying an absolute mandate.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. The comments highlight important aspects of experimental rigor that we will address in the revision.

Referee: [Abstract] Abstract: the central claim that many SOTA methods are outperformed by random selection is only weakly supported because the abstract (and by extension the reported experiments) provides no details on dataset count, statistical testing, exact random implementation, or handling of ties. This directly undermines the reader's weakest assumption that the chosen datasets, metrics, and random baseline form a fair test.

Authors: The abstract is space-constrained by design, but the manuscript body reports the datasets, metrics, and random selection procedure. We agree that explicit statistical testing, precise random implementation details, and tie handling are needed for full transparency. In the revision we will add these elements to the experiments section (including significance tests and a clarification that random selection is uniform sampling without replacement) and update the abstract to reference the dataset count and the averaged random baseline.

Revision: yes

Referee: [Experiments] Experimental design: to validly claim that random outperforms the methods, the random baseline must select precisely the same number of features k as each compared method and average performance over multiple independent draws rather than a single trial or fixed global k. Without this, any apparent outperformance could arise from mismatched cardinality or sampling variance rather than from the methods adding no value.

Authors: We agree this is a necessary control. The current experiments already match k exactly per method and dataset; however, to control for sampling variance we will revise the protocol to average random performance over multiple independent draws (reporting means and standard deviations) rather than single trials. This change will be implemented and documented in the revised manuscript.

Revision: yes
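
Once the random baseline is summarized by a mean and standard deviation over repeated draws, a method's score can be standardized against that distribution, which appears to be the idea behind the z-score figure listed above. The sketch below assumes that reading; `z_vs_random` and all numbers are hypothetical, and the paper may define the quantity differently.

```python
import statistics


def z_vs_random(method_score, random_scores):
    """Standardize a method's score against scores from random draws.

    Positive values mean the method beat the average random subset;
    negative values mean it did worse. Assumes higher scores are better.
    """
    mu = statistics.mean(random_scores)
    sigma = statistics.stdev(random_scores)  # sample standard deviation
    if sigma == 0:
        raise ValueError("random scores have zero variance; z-score undefined")
    return (method_score - mu) / sigma


# Made-up scores from five random draws at the same k as the method.
random_scores = [0.60, 0.62, 0.58, 0.61, 0.59]
z = z_vs_random(0.57, random_scores)
print(z)  # negative: this made-up method scores below the random average
```

A z-score near zero would mean the method is indistinguishable from a typical random subset at that k, which is the situation the referee's control is meant to expose.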

Circularity Check

0 steps flagged

Empirical comparison study with no derivation chain or fitted inputs

The paper is an empirical evaluation that proposes random feature selection as a baseline and reports that many existing unsupervised FS methods underperform it on chosen datasets and metrics. No mathematical derivation, equations, fitted parameters, ansatz, or uniqueness theorems are present in the abstract or described structure. No self-citations are invoked as load-bearing support for any claim. The central result rests on direct experimental comparisons rather than any reduction of outputs to inputs by construction. This matches the default expectation for non-circular empirical work.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that standard unsupervised evaluation metrics and benchmark datasets are sufficient to detect whether a method exceeds chance-level performance.

axioms (1)

Domain assumption: Standard unsupervised feature selection evaluation metrics accurately reflect method quality. The paper uses these metrics to declare outperformance by random selection.


[1] [1]

Kuncheva , editor =

Ludmila I. Kuncheva , editor =. A stability index for feature selection , booktitle =. 2007 , timestamp =

work page 2007

[2] [2]

Malan and Andries P

Werner Mostert and Katherine M. Malan and Andries P. Engelbrecht , title =. Algorithms , volume =. 2021 , Xurl =

work page 2021

[3] [3]

Unsupervised feature selection by learning exponential weights , journal =

Chenchen Wang and Jun Wang and Zhichen Gu and Jin-Mao Wei and Jian Liu , keywords =. Unsupervised feature selection by learning exponential weights , journal =. 2024 , xissn =

work page 2024

[4] [4]

IEEE TNNLS , title=

Guo, Yu and Sun, Yuan and Wang, Zheng and Nie, Feiping and Wang, Fei , Xjournal=. IEEE TNNLS , title=. 2023 , volume=

work page 2023

[5] [5]

Unsupervised feature selection via dual space-based low redundancy scores and extended OLSDA , Xjournal =

Duanzhang Li and Hongmei Chen and Yong Mi and Chuan Luo and Shi-Jinn Horng and Tianrui Li , keywords =. Unsupervised feature selection via dual space-based low redundancy scores and extended OLSDA , Xjournal =. Inf. Sci. , volume =. 2024 , xissn =

work page 2024

[6] [6]

Unsupervised feature selection guided by orthogonal representation of feature space , journal =

Mahsa Samareh Jahani and Gholamreza Aghamollaei and Mahdi Eftekhari and Farid Saberi. Unsupervised feature selection guided by orthogonal representation of feature space , journal =. 2023 , Xurl =

work page 2023

[7] [7]

Zhiwen Cao and Xijiong Xie and Feixiang Sun and Jiabei Qian , title =. Knowl. Based Syst. , volume =. 2023 , Xurl =

work page 2023

[8] [8]

2023 , Xurl =

Dan Shi and Lei Zhu and Jingjing Li and Zheng Zhang and Xiaojun Chang , title =. 2023 , Xurl =

work page 2023

[9] [9]

Pattern Recognit

Wei Zheng and Xiaofeng Zhu and Guoqiu Wen and Yonghua Zhu and Hao Yu and Jiangzhang Gan , title =. Pattern Recognit. Lett. , volume =. 2020 , Xurl =

work page 2020

[10] [10]

Neural Comput

Ritam Guha and Hussain Ali Khan and Pawan Kumar Singh and Ram Sarkar and Debotosh Bhattacharjee , title =. Neural Comput. Appl. , volume =. 2021 , Xurl =

work page 2021

[11] [11]

Pattern Recognit

Yuling Fan and Jinghua Liu and Jianeng Tang and Peizhong Liu and Yaojin Lin and Yongzhao Du , title =. Pattern Recognit. , volume =. 2024 , Xurl =

work page 2024

[12] [12]

Jia Liu and Dong Li and Wangweiyi Shan and Shulin Liu , title =. Appl. Soft Comput. , volume =. 2024 , Xurl =

work page 2024

[13] [13]

Ijaz Ahmad and Chen Yao and Lin Li and Yan Chen and Zhenzhen Liu and Inam Ullah and Mohammad Shabaz and Xin Wang and Kaiyang Huang and Guanglin Li and Guoru Zhao and Oluwarotimi Williams Samuel and Shixiong Chen , title =. J. Inf. Secur. Appl. , volume =. 2024 , Xurl =

work page 2024

[14] [14]

Hongyu Pan and Shanxiong Chen and Hailing Xiong , title =. Appl. Soft Comput. , volume =. 2023 , Xurl =

work page 2023

[15] [15]

Pradip Dhal and Chandrashekhar Azad , title =. Appl. Intell. , volume =. 2022 , Xurls =

work page 2022

[16] [16]

Girish Chandrashekar and Ferat Sahin , title =. Comput. Electr. Eng. , volume =

work page

[17] [17]

Kelly, Markelle and Longjohn, Rachel and Nottingham, Kolby , title =

work page

[18] [18]

2015 , Xurls =

Xiaochun Cao and Changqing Zhang and Huazhu Fu and Si Liu and Hua Zhang , title =. 2015 , Xurls =

work page 2015

[19] [19]

Papadimitriou and Kenneth Steiglitz , title =

Christos H. Papadimitriou and Kenneth Steiglitz , title =. 1982 , xisbn =

work page 1982

[20] [20]

Houle and Marie Kiermeier and Arthur Zimek , editor =

Michael E. Houle and Marie Kiermeier and Arthur Zimek , editor =. Clustering High-Dimensional Data , booktitle =. 2023 , xdoi =

work page 2023

[21] [21]

ANN-Benchmarks:

Martin Aum. ANN-Benchmarks:. Inf. Syst. , volume =

work page

[22] [22]

A survey on unsupervised outlier detection in high-dimensional numerical data , journal =

Arthur Zimek and Erich Schubert and Hans. A survey on unsupervised outlier detection in high-dimensional numerical data , journal =

work page

[23] [23]

1991 , publisher=

An introduction to numerical analysis , author=. 1991 , publisher=

work page 1991

[24] [24]

Co, Tomas B. , year=. Methods of Applied Mathematics for Engineers and Scientists , publisher=

work page

[25] [25]

and Hyrup, Tobias and Zimek, Arthur

Rajabinasab, Muhammad and Lautrup, Anton D. and Hyrup, Tobias and Zimek, Arthur. A Dynamic Evaluation Metric for Feature Selection. Similarity Search and Applications. 2025

work page 2025

[26] [26]

Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy. IEEE TPAMI, 2005.

[27] [27]

Feature selection based on mutual information with correlation coefficient. Applied Intelligence, 2020.

[28] [28]

Correlation-based feature selection for machine learning. 1999.

[29] [29]

Xiaofei He, Deng Cai, and Partha Niyogi.

[30] [30]

Unsupervised feature selection using feature similarity. IEEE TPAMI, 2002.

[31] [31]

Hyunki Lim and Dae. Pairwise dependence-based unsupervised feature selection. 2021.

[32] [32]

Pei Huang and Xiaowei Yang. Pattern Recognit., 2022.

[33] [33]

M. Pawan Kumar, Benjamin Packer, and Daphne Koller. Self-Paced Learning for Latent Variable Models. 2010.

[34] [34]

Wei Zheng, Xiaofeng Zhu, Guoqiu Wen, Yonghua Zhu, Hao Yu, and Jiangzhang Gan. Pattern Recognit. Lett., 2020. doi:10.1016/j.patrec.2018.06.029

[35] [35]

Chuan Luo, Jian Zheng, Tianrui Li, Hongmei Chen, Yanyong Huang, and Xi Peng. Orthogonally constrained matrix factorization for robust unsupervised feature selection with local preserving. Inf. Sci., 2022. doi:10.1016/j.ins.2021.11.068

[36] [36]

Mohsen Ghassemi Parsa, Hadi Zare, and Mehdi Ghatee. Eng. Appl. Artif. Intell., 2020.

[37] [37]

Haifeng Zhao, Qi Li, Zheng Wang, and Feiping Nie. Cogn. Comput., 2022. doi:10.1007/s12559-021-09875-0

[38] [38]

Duanzhang Li, Hongmei Chen, Yong Mi, Chuan Luo, Shi-Jinn Horng, and Tianrui Li. Unsupervised feature selection via dual space-based low redundancy scores and extended OLSDA. Inf. Sci., 2024. doi:10.1016/j.ins.2024.120227

[39] [39]

Mahsa Samareh Jahani, Gholamreza Aghamollaei, Mahdi Eftekhari, and Farid Saberi. Unsupervised feature selection guided by orthogonal representation of feature space. 2023.

[40] [40]

Saeed Karami and Farid Saberi. Unsupervised feature selection based on variance-covariance subspace distance. 2023.

[41] [42]

Zhiwen Cao, Xijiong Xie, Feixiang Sun, and Jiabei Qian. Knowl. Based Syst., 2023. doi:10.1016/j.knosys.2023.110578

[42] [43]

Ayman Taha, Ali S. Hadi, Bernard Cosgrave, and Susan McKeever. Expert Syst. Appl., 2023. doi:10.1016/j.eswa.2022.118718

[43] [44]

Ronghua Shang, Jiarui Kong, Lujuan Wang, Weitong Zhang, Chao Wang, Yangyang Li, and Licheng Jiao. Neurocomputing, 2023.

[44] [45]

Tong Liu, Rongyao Hu, and Yongxin Zhu. Multim. Tools Appl., 2023. doi:10.1007/s11042-022-13903-y

[45] [46]

Mengbo You, Aihong Yuan, Dongjian He, and Xuelong Li. Pattern Recognit., 2023.

[46] [47]

Dan Shi, Lei Zhu, Jingjing Li, Zheng Zhang, and Xiaojun Chang. 2023. doi:10.1109/tip.2023.3234497

[47] [48]

Ian J. Goodfellow and Jean Pouget. Generative Adversarial Nets. 2014.

[48] [49]

R. J. Mach. Learn. Res., 2021.

[49] [50]

Marco Cuturi. Sinkhorn Distances: Lightspeed Computation of Optimal Transport. 2013.

[50] [51]

Lei Xu, Maria Skoularidou, and Alfredo Cuesta. Modeling Tabular data using Conditional. Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8-14, 2019, Vancouver, BC, Canada. 2019.

[51] [52]

Sarah Nogueira and Gavin Brown. Measuring the Stability of Feature Selection. 2016.

[52] [53]

Christos H. Papadimitriou and Kenneth Steiglitz. 1982.

[53] [54]

Xiaochun Cao, Changqing Zhang, Huazhu Fu, Si Liu, and Hua Zhang. 2015. doi:10.1109/cvpr.2015.7298657

[54] [55]

Alexander Strehl and Joydeep Ghosh. J. Mach. Learn. Res., 2002.

[55] [56]

Feature selection: A data perspective. ACM Computing Surveys (CSUR), 2018.

[56] [57]

David Arthur and Sergei Vassilvitskii. k-means++: the advantages of careful seeding. 2007.

[57] [58]

Leo Breiman, J. H. Friedman, Richard A. Olshen, and C. J. Stone. 1984.

[58] [59]

Xinxing Wu and Qiang Cheng. 2021.

[59] [60]

Nicolas Bonneel, Michiel van de Panne, Sylvain Paris, and Wolfgang Heidrich. 2011. doi:10.1145/2070781.2024192

[60] [61]

Yossi Rubner, Carlo Tomasi, and Leonidas J. Guibas. Int. J. Comput. Vis., 2000. doi:10.1023/a:1026543900054

[61] [62]

Damien Fran. The Concentration of Fractional Distances.

[62] [63]

Kevin S. Beyer, Jonathan Goldstein, Raghu Ramakrishnan, and Uri Shaft.

[63] [64]

Robert J. Durrant and Ata Kab. When is `nearest neighbour' meaningful:. J. Complex.

[64] [65]

Michael E. Houle and Hans. Can Shared-Neighbor Distances Defeat the Curse of Dimensionality?

[65] [66]

Michael E. Houle, Marie Kiermeier, and Arthur Zimek. Clustering High-Dimensional Data. 2023.

[66] [67]

Hans. Outlier Detection in Arbitrarily Oriented Subspaces.

[67] [68]

Alastair Anderberg, James Bailey, Ricardo J. G. B. Campello, Michael E. Houle, Henrique O. Marques, Milos Radovanovic, and Arthur Zimek.

[68] [69]

Diego Hern. Recent methods for dimensionality reduction:

[69] [70]

Dayton L. Jones. Technology challenges for the. 2010.

[70] [71]

M. E. Houle. SISAP.

[71] [72]

M. E. Houle, Vincent Oria, and Arwa M. Wali. SISAP.

[72] [73]

Finding nearest neighbors in growth-restricted metrics. STOC, 2002.

[73] [74]

Navigating nets: simple algorithms for proximity search. SODA, 2004.

[74] [75]

Cover trees for nearest neighbor. ICML, 2006.

[75] [76]

An introduction to variable and feature selection. JMLR, 2003.

[76] [77]

Efficient biased sampling for approximate clustering and outlier detection in large data sets. TKDE, 2003.

[77] [78]

An improved density-based cluster analysis method combining genetic algorithm and data sampling for large-scale datasets. CCC, 2013.

[78] [79]

Randomized sampling for large data applications of SVM. ICMLA, 2012.

[79] [80]

Parallelization of spectral clustering algorithm on multi-core processors and GPGPU. ACSAC, 2008.

[80] [81]

Evaluating parallel logistic regression models. Big Data, 2013.