Improving Performance in Classification Tasks with LCEN and the Weighted Focal Differentiable MCC Loss

Pedro Seber; Richard D. Braatz

arxiv: 2604.21252 · v1 · submitted 2026-04-23 · 💻 cs.LG

Pith reviewed 2026-05-09 23:10 UTC · model grok-4.3

keywords: LCEN · feature selection · classification · sparsity · interpretable models · differentiable MCC loss · macro F1 score · Matthews correlation coefficient

The pith

LCEN adapted for classification keeps models sparse and interpretable while diffMCC loss raises macro F1 by 4.9 percent and MCC by 8.5 percent over weighted cross-entropy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper adapts the LASSO-Clip-EN (LCEN) algorithm from regression to classification, preserving its ability to produce nonlinear yet interpretable models that discard unnecessary input features. On four widely used binary and multiclass datasets, the modified LCEN eliminates an average of 56 percent of features and still reaches macro F1 and MCC scores higher than those of most of the ten competing model types. When the other models are retrained solely on the LCEN-selected features, performance improves with statistical significance in three of the four experiments and is statistically unchanged in the fourth. Separately, models trained with the weighted focal differentiable MCC (diffMCC) loss instead of the usual weighted cross-entropy objective achieve the highest scores in every experiment. These results indicate that sparsity and a better-suited loss can be achieved together without sacrificing accuracy on the tested tasks.

Core claim

A classification-ready version of LCEN performs nonlinear interpretable feature selection, discards 56 percent of inputs on average, and yields competitive or superior macro F1 and MCC values; the same experiments show that training with the weighted focal differentiable MCC loss consistently beats weighted cross-entropy by average margins of 4.9 percent in F1 and 8.5 percent in MCC.
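
For readers less familiar with the two headline metrics, the following is a minimal sketch of how macro F1 and MCC can be computed from label lists. It uses the standard textbook definitions (the multiclass MCC is the usual confusion-matrix generalization); the function names and the sample labels are illustrative assumptions and are not taken from the paper.

```python
from collections import Counter
from math import sqrt


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = Counter(zip(y_true, y_pred))  # (true, predicted) -> count
    scores = []
    for k in labels:
        tp = pairs[(k, k)]
        fp = sum(pairs[(other, k)] for other in labels if other != k)
        fn = sum(pairs[(k, other)] for other in labels if other != k)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def mcc(y_true, y_pred):
    """Multiclass MCC; for two classes it equals the usual binary MCC."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = Counter(zip(y_true, y_pred))
    n = len(y_true)
    correct = sum(pairs[(k, k)] for k in labels)
    true_counts = [sum(pairs[(k, j)] for j in labels) for k in labels]
    pred_counts = [sum(pairs[(j, k)] for j in labels) for k in labels]
    num = correct * n - sum(t * p for t, p in zip(true_counts, pred_counts))
    den = sqrt((n * n - sum(p * p for p in pred_counts))
               * (n * n - sum(t * t for t in true_counts)))
    return num / den if den else 0.0  # 0.0 when one side is constant


y_true = [0] * 9 + [1]   # hypothetical imbalanced labels
y_pred = [0] * 10        # always predict the majority class
print(macro_f1(y_true, y_pred), mcc(y_true, y_pred))
```

On this hypothetical 90/10 sample, always predicting the majority class is 90 percent accurate, yet the macro F1 is about 0.47 and the MCC is 0.0. This illustrates why the two metrics are often reported for imbalanced tasks.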

What carries the argument

Modified LCEN algorithm that extends LASSO-Clip-EN feature selection to classification while enforcing sparsity and interpretability, together with the weighted focal differentiable MCC loss used as a training objective.
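
The exact weighted focal formulation of the diffMCC loss is not reproduced on this page. As a rough intuition only, a differentiable MCC objective can be built by replacing hard confusion-matrix counts with sums of predicted probabilities. The sketch below shows that plain relaxation for the binary case; it deliberately omits the class-weighting and focal terms, and the name `soft_mcc_loss` and the `eps` stabilizer are assumptions made for illustration.

```python
from math import sqrt


def soft_mcc_loss(probs, targets, eps=1e-8):
    """Return 1 - MCC computed from soft confusion-matrix counts (binary).

    probs: predicted probability of the positive class for each sample.
    targets: 0/1 labels. Weighting and focal terms are intentionally omitted.
    """
    tp = sum(p * y for p, y in zip(probs, targets))
    fp = sum(p * (1 - y) for p, y in zip(probs, targets))
    fn = sum((1 - p) * y for p, y in zip(probs, targets))
    tn = sum((1 - p) * (1 - y) for p, y in zip(probs, targets))
    num = tp * tn - fp * fn
    # eps keeps the denominator nonzero when a row or column sum vanishes.
    den = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn) + eps)
    return 1.0 - num / den


targets = [1, 0, 1, 0]
print(soft_mcc_loss([0.9, 0.1, 0.9, 0.1], targets))  # confident and correct
print(soft_mcc_loss([0.6, 0.4, 0.6, 0.4], targets))  # hesitant but correct
```

Because every step is a sum, product, or square root of the predicted probabilities, the same arithmetic written with tensors in an automatic-differentiation framework yields gradients, which is what allows an MCC-like quantity to serve as a training objective.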

If this is right

LCEN models remain sparse, eliminating an average of 56 percent of the input features while matching or exceeding the macro F1 and MCC of most of the competing models.
Retraining the compared models on only the LCEN-chosen features produces statistically significant gains in three of the four experiments.
The diffMCC loss is the top performer in every experiment and delivers measurable lifts in both macro F1 and MCC.
Feature selection by LCEN can be combined with standard retraining to improve results without increasing model complexity.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

If the sparsity pattern generalizes, LCEN could be applied upstream of any classifier to reduce data-collection costs in high-dimensional settings such as sensor arrays or genomics.
The consistent advantage of diffMCC suggests it may serve as a drop-in replacement for cross-entropy in other imbalanced or multi-class problems where correlation-based metrics matter.
Because LCEN-selected features improve downstream models, the method could be used as a diagnostic tool to identify which variables truly drive class separation.

Load-bearing premise

The four standard datasets used are representative enough that the observed sparsity and accuracy gains will hold for other real-world classification problems without extra tuning.

What would settle it

A new dataset drawn from a different domain in which LCEN models either retain fewer than 30 percent of features while losing accuracy or in which diffMCC-trained models fail to exceed weighted cross-entropy performance would contradict the reported pattern.

Original abstract

The LASSO-Clip-EN (LCEN) algorithm was previously introduced for nonlinear, interpretable feature selection and machine learning. However, its design and use were limited to regression tasks. In this work, we create a modified version of the LCEN algorithm that is suitable for classification tasks and maintains its desirable properties, such as interpretability. This modified LCEN algorithm is evaluated on four widely used binary and multiclass classification datasets. In these experiments, LCEN is compared against 10 other model types and consistently reaches high test-set macro F$_1$ score and Matthews correlation coefficient (MCC) metrics, higher than those of the majority of investigated models. LCEN models for classification remain sparse, eliminating an average of 56% of all input features in the experiments performed. Furthermore, LCEN-selected features are used to retrain all models using the same data, leading to statistically significant performance improvements in three of the experiments and insignificant differences in the fourth when compared to using all features or other feature selection methods. Simultaneously, the weighted focal differentiable MCC (diffMCC) loss function is evaluated on the same datasets. Models trained with the diffMCC loss function are always the best-performing methods in these experiments, and reach test-set macro F$_1$ scores that are, on average, 4.9% higher and MCCs that are 8.5% higher than those obtained by models trained with the weighted cross-entropy loss. These results highlight the performance of LCEN as a feature selection and machine learning algorithm also for classification tasks, and how the diffMCC loss function can train very accurate models, surpassing the weighted cross-entropy loss in the tasks investigated.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

LCEN adapts to classification with sparsity intact and the new diffMCC loss beats weighted cross-entropy on four datasets, but the reported gains lack run-to-run variance or significance tests.

This paper adapts the earlier LCEN feature selection method to classification and adds a weighted focal differentiable MCC loss. The LCEN version keeps the sparsity property, cutting features by 56% on average across the four datasets, and retraining on the selected features improves downstream model performance with statistical significance in three of the four experiments. The diffMCC loss is the top performer every time, delivering average gains of 4.9% in macro F1 and 8.5% in MCC over weighted cross-entropy. Those are the concrete pieces worth noting: a working extension of LCEN plus a loss that targets the MCC metric directly and differentiably. The experiments compare against ten other model types on standard binary and multiclass datasets, which is a reasonable scope for an empirical paper.

The sparsity and feature-selection results look like the stronger part because they include a significance check. The loss comparison is weaker because the abstract gives no standard deviations, no count of independent runs, and no p-values for the 4.9% and 8.5% margins. That makes it hard to judge whether the edge comes from the loss itself or from differences in tuning or random seeds. Model architectures and hyperparameter search details are also missing, so replication would require guesswork.

The work is incremental rather than foundational, but the practical angle on sparse interpretable classifiers plus a metric-focused loss could matter for people who already use LCEN or need better handling of imbalanced classification. It is worth sending to referees so they can request the missing variance numbers, run counts, and architecture specs. The central claims are testable once those details are added.

Referee Report

2 major / 2 minor

Summary. The manuscript extends the LASSO-Clip-EN (LCEN) algorithm to classification tasks while preserving interpretability and sparsity, evaluating the modified LCEN against 10 other models on four binary and multiclass datasets. It reports that LCEN eliminates an average of 56% of input features, that retraining with LCEN-selected features yields statistically significant performance gains in three of four experiments, and that the weighted focal differentiable MCC (diffMCC) loss consistently produces the best results, with average gains of 4.9% in macro F1 and 8.5% in MCC over weighted cross-entropy.

Significance. If the empirical claims hold under rigorous statistical controls, the work would be significant for supplying an interpretable sparse feature selector for classification and a differentiable loss that demonstrably improves upon cross-entropy. The reported sparsity and consistent outperformance could be useful in domains requiring both accuracy and feature transparency. The absence of variance estimates and reproducibility details, however, currently limits the strength of this contribution.

major comments (2)

[Abstract] The central claim that diffMCC-trained models are 'always the best-performing' and deliver fixed average improvements (4.9% F1, 8.5% MCC) over weighted cross-entropy is presented without standard deviations, number of independent runs, or any statistical test. This omission is load-bearing because the abstract contrasts it with the feature-selection results, which are explicitly labeled statistically significant; without these controls the headline margins cannot be distinguished from run-to-run variability or unequal hyperparameter effort.
[Experiments] No information is supplied on the precise architectures of the 10 comparator models, the hyperparameter search protocol, the cross-validation scheme, or the exact statistical tests used for the performance comparisons. These details are required to evaluate whether the reported superiority of LCEN and diffMCC is reproducible and not an artifact of implementation choices.
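
As an illustration of the kind of reporting requested in the major comments, the sketch below summarizes a paired comparison across repeated runs and applies an exact sign-flip permutation test. The test choice, the function names, and the per-seed scores are all assumptions; the numbers are invented for illustration and are not results from the paper.

```python
from itertools import product
from statistics import mean, stdev


def paired_summary(scores_a, scores_b):
    """Mean and sample standard deviation of per-run differences (a - b)."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    return mean(diffs), stdev(diffs)


def sign_flip_p_value(scores_a, scores_b):
    """Exact two-sided paired permutation test on the summed difference.

    Under the null hypothesis the sign of each per-run difference is
    arbitrary, so all 2**n sign patterns are enumerated.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))
    hits = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        if flipped >= observed - 1e-12:  # tolerance for float rounding
            hits += 1
    return hits / total


# Hypothetical per-seed test-set MCC values, invented for illustration only.
mcc_diffmcc = [0.62, 0.60, 0.65, 0.61, 0.63]
mcc_wce = [0.57, 0.58, 0.59, 0.56, 0.60]

print(paired_summary(mcc_diffmcc, mcc_wce))
print(sign_flip_p_value(mcc_diffmcc, mcc_wce))
```

With n paired runs the smallest two-sided p-value this test can return is 2 / 2**n, so five runs cannot go below 0.0625 even when one method wins every time. This is one reason the number of independent runs needs to be stated alongside any significance claim.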

minor comments (2)

[Abstract] The abstract states that LCEN 'remains sparse' and eliminates 56% of features on average; a table or figure quantifying per-dataset sparsity and the precise definition of 'eliminated' features would improve clarity.
Ensure that all reported averages in tables are accompanied by standard deviations and the number of runs; this is a presentation issue that does not affect the core claims but is needed for completeness.
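
To make the request for a precise definition of 'eliminated' concrete, here is a minimal sketch of one possible definition: a feature counts as eliminated only if every model term involving it has a zero coefficient after small coefficients are clipped. The definition, the cutoff, the function names, and the coefficient values are assumptions for illustration; the page does not state how the authors count eliminated features.

```python
def clip_coefficients(coefs, cutoff):
    """Zero out coefficients whose magnitude is below the cutoff."""
    return [c if abs(c) >= cutoff else 0.0 for c in coefs]


def fraction_eliminated(coefs_by_feature):
    """Fraction of input features with no surviving (nonzero) term.

    coefs_by_feature maps each original input feature to the coefficients
    of all model terms that involve it (for example x and x**2).
    """
    dropped = sorted(
        name for name, coefs in coefs_by_feature.items()
        if all(c == 0.0 for c in coefs)
    )
    return len(dropped) / len(coefs_by_feature), dropped


# Hypothetical coefficients for four input features, two terms each.
raw = {
    "age": [0.8, 0.0],
    "bmi": [0.003, 0.0],
    "dose": [0.0, -0.4],
    "site": [0.0, 0.0],
}
clipped = {name: clip_coefficients(c, cutoff=0.01) for name, c in raw.items()}
print(fraction_eliminated(clipped))
```

Reporting this fraction per dataset, together with the list of dropped features, would give readers the per-dataset sparsity that the 56% average summarizes.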

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for their thoughtful and constructive review. We address each major comment below and have revised the manuscript to incorporate additional details and statistical reporting where feasible.

Referee: [Abstract] The central claim that diffMCC-trained models are 'always the best-performing' and deliver fixed average improvements (4.9% F1, 8.5% MCC) over weighted cross-entropy is presented without standard deviations, number of independent runs, or any statistical test. This omission is load-bearing because the abstract contrasts it with the feature-selection results, which are explicitly labeled statistically significant; without these controls the headline margins cannot be distinguished from run-to-run variability or unequal hyperparameter effort.

Authors: We agree that the abstract would be strengthened by including measures of variability and experimental repetition details for the diffMCC results. We have revised the abstract to report the average improvements together with their standard deviations across repeated runs and to note the statistical tests performed. This change ensures the reporting is consistent with the statistically significant feature-selection claims and allows readers to assess the robustness of the observed margins.

Revision: yes

Referee: [Experiments] No information is supplied on the precise architectures of the 10 comparator models, the hyperparameter search protocol, the cross-validation scheme, or the exact statistical tests used for the performance comparisons. These details are required to evaluate whether the reported superiority of LCEN and diffMCC is reproducible and not an artifact of implementation choices.

Authors: We appreciate the referee's emphasis on reproducibility. We have expanded the Experiments section to provide the requested information, including descriptions of the comparator model architectures, the hyperparameter search protocol and ranges, the cross-validation procedure, and the exact statistical tests used for all performance comparisons. These additions directly address concerns about potential implementation artifacts.

Revision: yes

Circularity Check

0 steps flagged

No circularity detected in empirical model comparisons or loss evaluations.

full rationale

The paper extends the prior LCEN algorithm (introduced for regression) to classification tasks via a described modification, then reports experimental results on four fixed datasets against 10 other models, plus comparisons of diffMCC loss versus weighted cross-entropy. No mathematical derivation chain exists; claims rest on direct performance metrics (F1, MCC), sparsity counts, and limited statistical tests for feature selection only. Self-citation of the original LCEN paper is present but non-load-bearing, as the new classification results and loss-function gains are independently measured on held-out test sets rather than derived from or fitted to the cited work. No self-definitional reductions, fitted inputs renamed as predictions, or ansatz smuggling occur.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract provides no explicit free parameters, axioms, or invented entities. The work rests on standard supervised learning assumptions that the chosen datasets are representative and that F1 and MCC are appropriate summary metrics.

pith-pipeline@v0.9.0 · 5612 in / 1167 out tokens · 49564 ms · 2026-05-09T23:10:30.067356+00:00 · methodology

[1] [1]

2025 , url=

Pedro Seber and Richard Braatz , journal=. 2025 , url=

work page 2025

[2] [2]

Proceedings of the ACM on Human-Computer Interaction , month =

Hong, Sungsoo Ray and Hullman, Jessica and Bertini, Enrico , title =. Proceedings of the ACM on Human-Computer Interaction , month =. 2020 , issue_date =

work page 2020

[3] [3]

Notes on the n-Person Game --

Lloyd Stowell Shapley , journal =. Notes on the n-Person Game --

work page

[4] [4]

Why Should I Trust You?

Ribeiro, Marco Tulio and Singh, Sameer and Guestrin, Carlos , title =. 2016 , isbn =. doi:10.1145/2939672.2939778 , booktitle =

work page doi:10.1145/2939672.2939778 2016

[5] [5]

Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead , volume =

Cynthia Rudin , url =. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead , volume =. Nature Machine Intelligence , months =

work page

[6] [6]

, title =

Dasgupta, Anirban and Drineas, Petros and Harb, Boulos and Josifovski, Vanja and Mahoney, Michael W. , title =. 2007 , isbn =. doi:10.1145/1281192.1281220 , booktitle =

work page doi:10.1145/1281192.1281220 2007

[7] [7]

and Brkić, K

Jović, A. and Brkić, K. and Bogunović, N. , booktitle=. A review of feature selection methods with applications , year=

work page

[8] [8]

A Survey on Evolutionary Multiobjective Feature Selection in Classification: Approaches, Applications, and Challenges , year=

Jiao, Ruwang and Nguyen, Bach Hoai and Xue, Bing and Zhang, Mengjie , journal=. A Survey on Evolutionary Multiobjective Feature Selection in Classification: Approaches, Applications, and Challenges , year=

work page

[9] [9]

Cybenko , doi =

G. Cybenko , doi =. Approximation by superpositions of a sigmoidal function , volume =. Mathematics of Control, Signals, and Systems , month =

work page

[10] [10]

Multilayer feedforward networks are universal approximators

Multilayer feedforward networks are universal approximators , journal =. 1989 , _issn =. doi:https://doi.org/10.1016/0893-6080(89)90020-8 , url =

work page doi:10.1016/0893-6080(89)90020-8 1989

[11] [11]

Approximation capabilities of multilayer feedforward networks

Approximation capabilities of multilayer feedforward networks , journal =. 1991 , issn =. doi:https://doi.org/10.1016/0893-6080(91)90009-T , url =

work page doi:10.1016/0893-6080(91)90009-t 1991

[12] [12]

Predicting

Pedro Seber , year=. Predicting. 2402.17131 , archivePrefix=

work page arXiv

[13] [13]

, elocation-id =

Seber, Pedro and Braatz, Richard D. , elocation-id =. Machine-Learning-Based Prediction of. 2025 , doi =

work page 2025

[14] [14]

, journal=

Akaike, H. , journal=. A new look at the statistical model identification , year=

work page

[15] [15]

The Annals of Statistics , number =

Gideon Schwarz , title =. The Annals of Statistics , number =. 1978 , _doi =

work page 1978

[16] [16]

, title =

Santosa, Fadil and Symes, William W. , title =. 1986 , _doi =

work page 1986

[17] Tibshirani, Robert. Journal of the Royal Statistical Society: Series B (Methodological).

[18] Sparse and Faithful Explanations Without Sparse Models. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, 2024.

[19] Heinze, Georg; Wallisch, Christine; Dunkler, Daniela. Biometrical Journal.

[20] Whittingham, Mark J.; Stephens, Philip A.; Bradbury, Richard B.; Freckleton, Robert P. Journal of Animal Ecology.

[21] Gary Smith. Step away from stepwise. Journal of Big Data.

[22] Variable selection and corporate bankruptcy forecasts. 2015.

[23] Zou, Hui; Hastie, Trevor. Journal of the Royal Statistical Society Series B: Statistical Methodology, 2005.

[24] Hebiri, Mohamed; Lederer, Johannes. How Correlations Influence Lasso Prediction.

[25] Arnak S. Dalalyan; Mohamed Hebiri; Johannes Lederer. Bernoulli, 2017.

[26] Pavlou, Menelaos; Ambler, Gareth; Seaman, Shaun; De Iorio, Maria; Omar, Rumana Z. Statistics in Medicine.

[27] McConaghy, Trent. FFX: Fast, Scalable, Deterministic Symbolic Regression Technology. Genetic Programming Theory and Practice IX, 2011.

[28] Enabling reduced-order data-driven nonlinear identification and modeling through naïve elastic net regularization. 2017.

[29] Computers & Chemical Engineering, 2020.

[30] Jianqing Fan; Runze Li. Journal of the American Statistical Association, 2001.

[31] Ravikumar, Pradeep; Lafferty, John; Liu, Han; Wasserman, Larry. Journal of the Royal Statistical Society Series B: Statistical Methodology, 2009.

[32] Fast Sparse Classification for Generalized Linear and Additive Models. Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, 2022.

[33] Hui Zou; Hao Helen Zhang. The Annals of Statistics, 2009.

[34] Cun-Hui Zhang. The Annals of Statistics, 2010.

[35] Sara van de Geer; Peter B. Electronic Journal of Statistics, 2011.

[36] De Mol, Christine; Mosci, Sofia; Traskine, Magali; Verri, Alessandro. Journal of Computational Biology, 2009.

[37] Yamada, Makoto; Tang, Jiliang; Lugo-Martinez, Jose; Hodzic, Ermin; Shrestha, Raunak; Saha, Avishek; Ouyang, Hua; Yin, Dawei; Mamitsuka, Hiroshi; Sahinalp, Cenk; Radivojac, Predrag; Menczer, Filippo; Chang, Yi. Ultra High-Dimensional Nonlinear Feature Selection for Big Biological Data.

[38] Dimitris Bertsimas; Bart Van Parys. The Annals of Statistics, 2020.

[39] Xu, Kai; Srivastava, Akash; Gutfreund, Dan; Sosa, Felix; Ullman, Tomer; Tenenbaum, Josh; Sutton, Charles. A

[40] Dinh, Vu C; Ho, Lam S. Consistent feature selection for analytic deep neural networks.

[41] Scardapane, Simone; Comminiello, Danilo; Hussain, Amir; Uncini, Aurelio. Group sparse regularization for deep neural networks. Neurocomputing.

[42] Jian Wang; Huaqing Zhang; Junze Wang; Yifei Pu; Nikhil R. Pal. Feature Selection Using a Neural Network With Group. IEEE Transactions on Neural Networks and Learning Systems.

[43] Lemhadri, Ismael; Ruan, Feng; Tibshirani, Rob. 2021.

[44] Zhao, Lei; Hu, Qinghua; Wang, Wenwu. Heterogeneous Feature Selection With Multi-Modal Deep Neural Networks and Sparse Group. 2015.

[45] Concrete Autoencoders: Differentiable Feature Selection and Reconstruction. Proceedings of the 36th International Conference on Machine Learning, 2019.

[46] Deep feature selection using a teacher-student network. 2020. doi:10.1016/j.neucom.2019.12.017

[47] Feature Selection using Stochastic Gates. Proceedings of the 37th International Conference on Machine Learning, 2020.

[48] Rosenzweig, Julia; Sicking, Joachim; Houben, Sebastian; Mock, Michael; Akila, Maram. Patch Shortcuts: Interpretable Proxy Models Efficiently Find Black-Box Vulnerabilities.

[49] Sebastian Lapuschkin; Stephan Wäldchen; Alexander Binder; Grégoire Montavon; Wojciech Samek; Klaus-Robert Müller. Unmasking. Nature Communications.

[50] John W. Tukey. Comparing Individual Means in the Analysis of Variance.

[51] Zhou, Shuheng. Thresholding Procedures for High Dimensional Variable Selection and Statistical Estimation.

[52] Nicolai Meinshausen; Bin Yu. The Annals of Statistics, 2009.

[53] Thresholded Lasso for high dimensional variable selection and statistical estimation. 2010.

[54] Alexandre Belloni; Victor Chernozhukov. Bernoulli, 2013.

[55] Extending and Benchmarking Cascade-Correlation: Extensions to the Cascade-Correlation Architecture and Benchmarking of Feed-forward Supervised Artificial Neural Networks. 1995.

[56] Tikhonov, A. N. Solution of incorrectly formulated problems and the regularization method. Doklady Akademii Nauk SSSR, 1963.

[57] Quantitative Sociology, 1975.

[58] Soft Modelling by Latent Variables: The Non-Linear Iterative Partial Least Squares (. Journal of Applied Probability, 1975.

[59] Tin Kam Ho. Random decision forests.

[60] Jerome H. Friedman. The Annals of Statistics, 2001.

[61] A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting. 1997.

[62] Boser, Bernhard E.; Guyon, Isabelle M.; Vapnik, Vladimir N. Proceedings of the Fifth Annual Workshop on Computational Learning Theory, 1992.

[63] Tanvir Ahmad; Assia Munir; Sajjad Haider Bhatti; Muhammad Aftab; Muhammad Ali Raza. Survival analysis of heart failure patients: A case study. PLOS ONE.

[64] Davide Chicco; Giuseppe Jurman. Machine learning can predict survival of patients with heart failure from serum creatinine and ejection fraction alone. BMC Medical Informatics and Decision Making.

[65] A data-driven approach to predict the success of bank telemarketing. Decision Support Systems 62, 22–31 (2014). doi:10.1016/j.dss.2014.03.001

[66] Modeling wine preferences by data mining from physicochemical properties. 2009. doi:10.1016/j.dss.2009.05.016

[67] B. German. 1987.

[68] The Use of Spark Source Mass Spectrometry for the Analysis of Glass Fragments Encountered in Forensic Applications, Part 2. 1973. doi:10.1016/S0015-7368(73)70826-4

[69] A Report on an Investigation into the Trace Elements Present in Vehicle Headlamp and Auxiliary Lamp Glasses. 1974. doi:10.1016/S0015-7368(74)70850-7

[70] Grace C. Y. Peng; Mark Alber; Adrian Buganza Tepole; William R. Cannon; Suvranu De; Savador Dura-Bernal; Krishna Garikipati; George Karniadakis; William W. Lytton; Paris Perdikaris; Linda Petzold; Ellen Kuhl. Multiscale Modeling Meets Machine Learning: What Can We Learn? Archives of Computational Method...

[71] Willard, Jared; Jia, Xiaowei; Xu, Shaoming; Steinbach, Michael; Kumar, Vipin. ACM Comput. Surv., 2022.