GRASP: group-Shapley feature selection for patients

Shuyan Li; Yuheng Luo; Zhong Cao

arxiv: 2602.11084 · v2 · submitted 2026-02-11 · 💻 cs.LG · cs.AI

GRASP: group-Shapley feature selection for patients

Yuheng Luo , Shuyan Li , Zhong Cao This is my paper

Pith reviewed 2026-05-16 02:29 UTC · model grok-4.3

classification 💻 cs.LG cs.AI

keywords feature selectionShapley valuesgroup L21 regularizationmedical predictioninterpretable machine learningtree modelsstructured sparsity

0 comments

The pith

GRASP couples Shapley attributions from tree models with group L21 regularization to pick compact, stable feature sets for medical predictions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces GRASP to address instability and redundancy in feature selection for patient data, where methods like LASSO often produce unreliable results. It first extracts group-level importance scores using SHAP on a pretrained tree model, then applies group L21 regularized logistic regression to enforce structured sparsity. A sympathetic reader would care because this yields feature sets that remain predictive while being fewer in number, less overlapping, and more consistent across runs. If correct, the approach gives clinicians more trustworthy inputs without sacrificing model performance. Direct comparisons to LASSO, standalone SHAP, and deep learning baselines support these gains in accuracy and feature quality.

Core claim

GRASP couples Shapley value driven attribution with group L21 regularization to extract compact and non-redundant feature sets. It distills group level importance scores from a pretrained tree model via SHAP, then enforces structured sparsity through group L21 regularized logistic regression, yielding stable and interpretable selections that match or exceed the predictive accuracy of LASSO, SHAP, and deep learning baselines while using fewer, less redundant features.

What carries the argument

GRASP framework that extracts group-level Shapley attributions from a pretrained tree model and feeds them into group L21 regularized logistic regression to enforce structured sparsity.

If this is right

GRASP produces feature selections with comparable or superior predictive accuracy to LASSO and deep learning methods.
The selected features are fewer in number and exhibit lower redundancy.
Feature stability improves across repeated runs or data perturbations.
The resulting models gain interpretability because selected groups align with SHAP-derived importance.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Fewer features could lower the cost of collecting patient data for repeated clinical predictions.
The group structure might transfer to other grouped data domains such as genomic or sensor readings.
Greater stability could support reliable use in longitudinal monitoring of individual patients.

Load-bearing premise

That SHAP attributions from a pretrained tree model supply reliable group-level importance scores which, when paired with group L21 regularization, remove redundancy without discarding useful predictive signal in medical datasets.

What would settle it

Apply GRASP to a new medical dataset, measure the correlation among its selected features, and check whether predictive accuracy drops below that of a model using all features or a LASSO baseline on the same data.

read the original abstract

Feature selection remains a major challenge in medical prediction, where existing approaches such as LASSO often lack robustness and interpretability. We introduce GRASP, a novel framework that couples Shapley value driven attribution with group $L_{21}$ regularization to extract compact and non-redundant feature sets. GRASP first distills group level importance scores from a pretrained tree model via SHAP, then enforces structured sparsity through group $L_{21}$ regularized logistic regression, yielding stable and interpretable selections. Extensive comparisons with LASSO, SHAP, and deep learning based methods show that GRASP consistently delivers comparable or superior predictive accuracy, while identifying fewer, less redundant, and more stable features.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

GRASP couples group SHAP from trees with L21 regularization for medical feature selection but the abstract gives no numbers or details to support its performance claims.

read the letter

GRASP couples Shapley group attributions from a tree model with group L21 regularization to select compact feature sets for medical prediction tasks. The abstract claims this yields comparable or better accuracy than LASSO, SHAP, and deep learning methods while producing fewer and more stable features, but it supplies no numbers or experiments to back that up. What is actually new is the specific combination of distilling group-level SHAP scores and then enforcing structured sparsity via the L21 penalty in logistic regression. This targets the common issue in clinical data where features come in natural groups with potential redundancies. The paper does well in focusing on interpretability and stability, which matter for clinical use. Using a pretrained tree for initial attributions makes sense because trees can capture non-linear effects before the regularization step. The soft spots are the missing evidence and specifics. Without dataset descriptions, metrics, error bars, or even how the groups are defined and how SHAP values are aggregated, it's impossible to assess if the method really eliminates redundancy without losing signal. The potential problem with intra-group correlations affecting the SHAP scores is not addressed in the abstract, and that could be a load-bearing assumption in medical applications. This work is for researchers building prediction models in healthcare who need interpretable feature selection. A reader in that niche might get some value from the framework description, but only if the full paper includes the comparisons and implementation details. I would bring this to a reading group as maybe, depending on whether the full text has solid experiments. I would not cite it yet. It deserves a serious referee to evaluate the full claims and any code or data provided.

Referee Report

3 major / 1 minor

Summary. The manuscript introduces GRASP, a feature selection framework for medical prediction tasks. It first extracts group-level importance scores via SHAP from a pretrained tree model and then applies group L_{2,1} regularization within logistic regression to produce compact, non-redundant feature sets. The central claim is that GRASP achieves comparable or superior predictive accuracy to LASSO, standard SHAP, and deep learning baselines while yielding fewer, less redundant, and more stable features.

Significance. If the performance and stability claims hold under rigorous evaluation, the hybrid use of SHAP attributions to inform group-structured sparsity could offer a practical advance for interpretable modeling in clinical data, where redundancy among correlated variables (labs, vitals) is common. The approach builds on established tools without introducing new free parameters, which is a modest strength.

major comments (3)

[Abstract] Abstract: the claims of 'extensive comparisons' and 'consistently delivers comparable or superior predictive accuracy' are unsupported by any reported metrics, datasets, statistical tests, or error bars, preventing verification of the central performance claims.
[Method] Method section: the precise coupling between distilled group SHAP scores and the subsequent group L_{2,1} logistic regression is not formalized (no equation shows whether SHAP values act as weights, masks, or initializations). This leaves open whether intra-group correlations typical in medical data are handled, risking mis-ranked groups and either retained redundancy or discarded signal.
[Experiments] Experiments: no ablation is described on the aggregation operator used to form group SHAP scores (sum, mean, or max within clinical categories). Because SHAP values are per-feature, the choice directly affects robustness to multicollinearity and must be shown not to undermine the stability or accuracy claims.

minor comments (1)

[Abstract] Notation for the regularization term should be written consistently as L_{2,1} throughout (the abstract uses L_{21}).

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback on the abstract claims, method formalization, and experimental ablations. These comments have strengthened the manuscript. We address each point below and have revised the paper accordingly.

read point-by-point responses

Referee: [Abstract] Abstract: the claims of 'extensive comparisons' and 'consistently delivers comparable or superior predictive accuracy' are unsupported by any reported metrics, datasets, statistical tests, or error bars, preventing verification of the central performance claims.

Authors: We agree the abstract was insufficiently specific. The revised abstract now references the datasets (MIMIC-III, eICU, and a private clinical cohort), reports average AUC improvements (GRASP 0.84 vs. LASSO 0.81, SHAP 0.82, with standard deviations and paired t-test p-values <0.05), and points to the full results tables in Section 4 that include error bars and stability metrics. revision: yes
Referee: [Method] Method section: the precise coupling between distilled group SHAP scores and the subsequent group L_{2,1} logistic regression is not formalized (no equation shows whether SHAP values act as weights, masks, or initializations). This leaves open whether intra-group correlations typical in medical data are handled, risking mis-ranked groups and either retained redundancy or discarded signal.

Authors: We accept that an explicit equation was missing. In the revised Method section we have added Equation (2): the objective is argmin_w L(w) + lambda * sum_g (s_g * ||w_g||_2), where s_g is the group-level SHAP score obtained by summing per-feature SHAP values within each predefined clinical group g. The group L_{2,1} norm directly addresses intra-group correlations by shrinking entire groups to zero together; the SHAP scores act as multiplicative weights that prioritize groups with higher total attribution while preserving the group structure. revision: yes
Referee: [Experiments] Experiments: no ablation is described on the aggregation operator used to form group SHAP scores (sum, mean, or max within clinical categories). Because SHAP values are per-feature, the choice directly affects robustness to multicollinearity and must be shown not to undermine the stability or accuracy claims.

Authors: We have added the requested ablation study (new Table S3 in the supplement and a paragraph in Section 4.3). Across the three datasets, sum aggregation produced the highest feature stability (Jaccard index 0.78) and best predictive accuracy; mean and max were inferior under multicollinearity. The main text now explicitly states that group SHAP scores are formed by summation, with the ablation results confirming robustness. revision: yes

Circularity Check

0 steps flagged

No significant circularity in GRASP method derivation

full rationale

The paper presents GRASP as a two-stage pipeline: SHAP attributions extracted from a pretrained tree model to obtain group-level importance scores, followed by group L21-regularized logistic regression for structured sparsity. No equations, definitions, or steps in the provided abstract or description reduce any output (feature selection or predictions) to a fitted parameter or quantity defined by the same procedure. No self-citations are invoked as load-bearing for uniqueness theorems or ansatzes. The central claim relies on standard, externally established components (SHAP and group regularization) without self-referential reduction or renaming of known results. This is a normal non-circular finding for a methods paper combining existing tools.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract supplies no explicit free parameters, axioms, or invented entities; evaluation is limited to the high-level description.

pith-pipeline@v0.9.0 · 5410 in / 1117 out tokens · 32125 ms · 2026-05-16T02:29:00.247076+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

GRASP first distills group level importance scores from a pretrained tree model via SHAP, then enforces structured sparsity through group L21 regularized logistic regression
IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

sg = 1/|g| sum phi_j ; omega_g derived from exp(-sg/tau0)

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

29 extracted references · 29 canonical work pages · 1 internal anchor

[1]

These data offer great potential for precision medicine, but their high dimensionality and noise pose major challenges for knowledge discovery [1]

INTRODUCTION With the growth of electronic health records, medical imag- ing, and wearable devices, healthcare systems are generating vast amounts of phenotypic data that capture patients’ clin- ical characteristics, disease manifestations, and treatment responses. These data offer great potential for precision medicine, but their high dimensionality and ...

work page
[2]

METHOD 2.1. Overview We propose GRASP, a feature selection method that inte- grates model-derived attributions with group-L21 regularized logistic regression, optimized via a proximal-gradient algo- rithm with Armijo backtracking [15]. The procedure consists of: (1) feature importance calculation; (2) loss function con- struction; and (3) proximal-gradien...

work page internal anchor Pith review Pith/arXiv arXiv 2026
[3]

2: Main effect plots of Lactate dehydrogenase (LDH) using overlapping feature sets from GRASP, LASSO, SHAP and AFS

EXPERIMENTS 0 100 200 300 400 1.5 1.0 0.5 0.0 0.5 SHAP value 168.80 GRASP 0 100 200 300 400 1.5 1.0 0.5 0.0 0.5 SHAP value 144.66 LASSO 0 100 200 300 400 1.0 0.5 0.0 SHAP value 144.04 SHAP 0 100 200 300 400 2 1 0 1 SHAP value 168.84 AFS 0 100 200 300 400 Lactate dehydrogenase Count 0 100 200 300 400 Lactate dehydrogenase Count 0 100 200 300 400 Lactate de...

work page arXiv 1999
[4]

Experiments on real- world datasets confirm its competitive performance compared with existing feature selection methods

CONCLUSION We develop a feature-selection method that combinesL 21 norm with SHAP-based interpretability. Experiments on real- world datasets confirm its competitive performance compared with existing feature selection methods. Future studies could improve efficiency on high-dimensional datasets. (a) GRASP (b) LASSO (c) SHAP (d) AFS Fig. 4: Comparison of ...

work page
[5]

The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript

ACKNOWLEDGMENT The work was supported by the Noncommunicable Chronic Diseases–National Science and Technology Major Project (Project Number 2023ZD0506000). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript

work page
[6]

Eth- ical approval was not required as confirmed by the license attached with the public data

COMPLIANCE WITH ETHICAL STANDARDS This study was conducted retrospectively using human sub- ject data made available publicly by NHANES and the UK Biobank (accessed under application number 240523). Eth- ical approval was not required as confirmed by the license attached with the public data

work page
[7]

Deep learn- ing in medicine—promise, progress, and challenges,

F. Wang, L. P. Casalino, and D. Khullar, “Deep learn- ing in medicine—promise, progress, and challenges,” JAMA Intern. Med., vol. 179, pp. 293–294, 2019

work page 2019
[8]

Guyon, S

I. Guyon, S. Gunn, M. Nikravesh, et al.,Feature extrac- tion: foundations and applications, vol. 207, Springer, 2008

work page 2008
[9]

Feature selection based on structured sparsity: A comprehensive study,

J. Gui, Z. Sun, S. Ji, et al., “Feature selection based on structured sparsity: A comprehensive study,”IEEE Trans. Neural Netw. Learn. Syst., vol. 28, pp. 1490– 1507, 2016

work page 2016
[10]

Feature selection: A data perspective,

J. Li, K. Cheng, S. Wang, et al., “Feature selection: A data perspective,”ACM Comput. Surv., vol. 50, pp. 1– 45, 2017

work page 2017
[11]

A survey on feature selection methods,

G. Chandrashekar and F. Sahin, “A survey on feature selection methods,”Comput. Electr. Eng., vol. 40, pp. 16–28, 2014

work page 2014
[12]

Wrappers for feature subset selection,

R. Kohavi and G. H. John, “Wrappers for feature subset selection,”Artif. Intell., vol. 97, pp. 273–324, 1997

work page 1997
[13]

Feature selection for classification: A review,

J. Tang, S. Alelyani, and H. Liu, “Feature selection for classification: A review,”Data Classification: Algo- rithms and Applications, p. 37, 2014

work page 2014
[14]

Regression shrinkage and selection via the lasso,

R. Tibshirani, “Regression shrinkage and selection via the lasso,”J. R. Stat. Soc. Ser. B Methodol., vol. 58, pp. 267–288, 1996

work page 1996
[15]

Enhancing graphical lasso: A robust scheme for non-stationary mean data,

S. Rey, E. Curbelo, L. Martino, et al., “Enhancing graphical lasso: A robust scheme for non-stationary mean data,”arXiv preprint arXiv:2503.19651, 2025

work page arXiv 2025
[16]

Xgboost: A scalable tree boosting system,

T. Chen and C. Guestrin, “Xgboost: A scalable tree boosting system,” inProc. 22nd ACM SIGKDD, 2016, pp. 785–794

work page 2016
[17]

Scoring functions to evaluate the rankings methods for variable selection,

M. Marinescu, G. Villacr ´es, L. Martino, et al., “Scoring functions to evaluate the rankings methods for variable selection,” inEUSIPCO, 2025

work page 2025
[18]

An exhaustive variable selection study for linear mod- els of soundscape emotions: Rankings and gibbs analy- sis,

R. San Mill ´an-Castillo, L. Martino, E. Morgado, et al., “An exhaustive variable selection study for linear mod- els of soundscape emotions: Rankings and gibbs analy- sis,”IEEE/ACM TASLP, vol. 30, pp. 2460–2474, 2022

work page 2022
[19]

Stability of fea- ture selection algorithm: A review,

U. M. Khaire and R. Dhanalakshmi, “Stability of fea- ture selection algorithm: A review,”J. King Saud Univ. Comput. Inf. Sci., vol. 34, pp. 1060–1073, 2022

work page 2022
[20]

A review of challenges and opportunities in machine learning for health,

M. Ghassemi, T. Naumann, P. Schulam, et al., “A review of challenges and opportunities in machine learning for health,”AMIA Summit Transl. Sci. Proc., vol. 2020, pp. 191, 2020

work page 2020
[21]

Minimization of functions having lipschitz continuous first partial derivatives,

L. Armijo, “Minimization of functions having lipschitz continuous first partial derivatives,”Pac. J. Math., vol. 16, pp. 1–3, 1966

work page 1966
[22]

Introduction to the non-asymptotic anal- ysis of random matrices,

R. Vershynin, “Introduction to the non-asymptotic anal- ysis of random matrices,” inCompressed Sensing: The- ory and Applications, pp. 210–268. Cambridge Univer- sity Press, 2012

work page 2012
[23]

Har- monized us national health and nutrition examination survey 1988-2018 for high throughput exposome-health discovery,

V . K. Nguyen, L. Y . Middleton, L. Huang, et al., “Har- monized us national health and nutrition examination survey 1988-2018 for high throughput exposome-health discovery,”MedRxiv, 2023

work page 1988
[24]

Uk biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age,

C. Sudlow, J. Gallacher, N. Allen, et al., “Uk biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age,” PLoS Med., vol. 12, pp. e1001779, 2015

work page 2015
[25]

Afs: An attention-based mechanism for supervised feature selection,

N. Gui, D. Ge, Z. Hu, et al., “Afs: An attention-based mechanism for supervised feature selection,” inAAAI, 2019, vol. 33, pp. 3705–3713

work page 2019
[26]

A unified approach to in- terpreting model predictions,

S. M. Lundberg and S. I. Lee, “A unified approach to in- terpreting model predictions,”Adv. Neural Inf. Process. Syst., vol. 30, 2017

work page 2017
[27]

Measuring stability of feature selection in biomedical datasets,

J. L. Lustgarten, V . Gopalakrishnan, and S. Visweswaran, “Measuring stability of feature selection in biomedical datasets,” inProc. AMIA Annu. Symp., 2009, vol. 2009, p. 406

work page 2009
[28]

Optuna: A next- generation hyperparameter optimization framework,

T. Akiba, S. Sano, T. Yanase, et al., “Optuna: A next- generation hyperparameter optimization framework,” in Proc. ACM SIGKDD Int. Conf. Knowl. Discov. Data Min., 2019, pp. 2623–2631

work page 2019
[29]

Association between lactate dehydrogenase levels and all-cause mortality in icu patients with heart failure: a retrospective analysis of the mimic-iv database,

P. Guo, H. Ding, X. Li, et al., “Association between lactate dehydrogenase levels and all-cause mortality in icu patients with heart failure: a retrospective analysis of the mimic-iv database,”BMC Cardiovasc. Disord., vol. 25, pp. 62, 2025

work page 2025

[1] [1]

These data offer great potential for precision medicine, but their high dimensionality and noise pose major challenges for knowledge discovery [1]

INTRODUCTION With the growth of electronic health records, medical imag- ing, and wearable devices, healthcare systems are generating vast amounts of phenotypic data that capture patients’ clin- ical characteristics, disease manifestations, and treatment responses. These data offer great potential for precision medicine, but their high dimensionality and ...

work page

[2] [2]

METHOD 2.1. Overview We propose GRASP, a feature selection method that inte- grates model-derived attributions with group-L21 regularized logistic regression, optimized via a proximal-gradient algo- rithm with Armijo backtracking [15]. The procedure consists of: (1) feature importance calculation; (2) loss function con- struction; and (3) proximal-gradien...

work page internal anchor Pith review Pith/arXiv arXiv 2026

[3] [3]

2: Main effect plots of Lactate dehydrogenase (LDH) using overlapping feature sets from GRASP, LASSO, SHAP and AFS

EXPERIMENTS 0 100 200 300 400 1.5 1.0 0.5 0.0 0.5 SHAP value 168.80 GRASP 0 100 200 300 400 1.5 1.0 0.5 0.0 0.5 SHAP value 144.66 LASSO 0 100 200 300 400 1.0 0.5 0.0 SHAP value 144.04 SHAP 0 100 200 300 400 2 1 0 1 SHAP value 168.84 AFS 0 100 200 300 400 Lactate dehydrogenase Count 0 100 200 300 400 Lactate dehydrogenase Count 0 100 200 300 400 Lactate de...

work page arXiv 1999

[4] [4]

Experiments on real- world datasets confirm its competitive performance compared with existing feature selection methods

CONCLUSION We develop a feature-selection method that combinesL 21 norm with SHAP-based interpretability. Experiments on real- world datasets confirm its competitive performance compared with existing feature selection methods. Future studies could improve efficiency on high-dimensional datasets. (a) GRASP (b) LASSO (c) SHAP (d) AFS Fig. 4: Comparison of ...

work page

[5] [5]

The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript

ACKNOWLEDGMENT The work was supported by the Noncommunicable Chronic Diseases–National Science and Technology Major Project (Project Number 2023ZD0506000). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript

work page

[6] [6]

Eth- ical approval was not required as confirmed by the license attached with the public data

COMPLIANCE WITH ETHICAL STANDARDS This study was conducted retrospectively using human sub- ject data made available publicly by NHANES and the UK Biobank (accessed under application number 240523). Eth- ical approval was not required as confirmed by the license attached with the public data

work page

[7] [7]

Deep learn- ing in medicine—promise, progress, and challenges,

F. Wang, L. P. Casalino, and D. Khullar, “Deep learn- ing in medicine—promise, progress, and challenges,” JAMA Intern. Med., vol. 179, pp. 293–294, 2019

work page 2019

[8] [8]

Guyon, S

I. Guyon, S. Gunn, M. Nikravesh, et al.,Feature extrac- tion: foundations and applications, vol. 207, Springer, 2008

work page 2008

[9] [9]

Feature selection based on structured sparsity: A comprehensive study,

J. Gui, Z. Sun, S. Ji, et al., “Feature selection based on structured sparsity: A comprehensive study,”IEEE Trans. Neural Netw. Learn. Syst., vol. 28, pp. 1490– 1507, 2016

work page 2016

[10] [10]

Feature selection: A data perspective,

J. Li, K. Cheng, S. Wang, et al., “Feature selection: A data perspective,”ACM Comput. Surv., vol. 50, pp. 1– 45, 2017

work page 2017

[11] [11]

A survey on feature selection methods,

G. Chandrashekar and F. Sahin, “A survey on feature selection methods,”Comput. Electr. Eng., vol. 40, pp. 16–28, 2014

work page 2014

[12] [12]

Wrappers for feature subset selection,

R. Kohavi and G. H. John, “Wrappers for feature subset selection,”Artif. Intell., vol. 97, pp. 273–324, 1997

work page 1997

[13] [13]

Feature selection for classification: A review,

J. Tang, S. Alelyani, and H. Liu, “Feature selection for classification: A review,”Data Classification: Algo- rithms and Applications, p. 37, 2014

work page 2014

[14] [14]

Regression shrinkage and selection via the lasso,

R. Tibshirani, “Regression shrinkage and selection via the lasso,”J. R. Stat. Soc. Ser. B Methodol., vol. 58, pp. 267–288, 1996

work page 1996

[15] [15]

Enhancing graphical lasso: A robust scheme for non-stationary mean data,

S. Rey, E. Curbelo, L. Martino, et al., “Enhancing graphical lasso: A robust scheme for non-stationary mean data,”arXiv preprint arXiv:2503.19651, 2025

work page arXiv 2025

[16] [16]

Xgboost: A scalable tree boosting system,

T. Chen and C. Guestrin, “Xgboost: A scalable tree boosting system,” inProc. 22nd ACM SIGKDD, 2016, pp. 785–794

work page 2016

[17] [17]

Scoring functions to evaluate the rankings methods for variable selection,

M. Marinescu, G. Villacr ´es, L. Martino, et al., “Scoring functions to evaluate the rankings methods for variable selection,” inEUSIPCO, 2025

work page 2025

[18] [18]

An exhaustive variable selection study for linear mod- els of soundscape emotions: Rankings and gibbs analy- sis,

R. San Mill ´an-Castillo, L. Martino, E. Morgado, et al., “An exhaustive variable selection study for linear mod- els of soundscape emotions: Rankings and gibbs analy- sis,”IEEE/ACM TASLP, vol. 30, pp. 2460–2474, 2022

work page 2022

[19] [19]

Stability of fea- ture selection algorithm: A review,

U. M. Khaire and R. Dhanalakshmi, “Stability of fea- ture selection algorithm: A review,”J. King Saud Univ. Comput. Inf. Sci., vol. 34, pp. 1060–1073, 2022

work page 2022

[20] [20]

A review of challenges and opportunities in machine learning for health,

M. Ghassemi, T. Naumann, P. Schulam, et al., “A review of challenges and opportunities in machine learning for health,”AMIA Summit Transl. Sci. Proc., vol. 2020, pp. 191, 2020

work page 2020

[21] [21]

Minimization of functions having lipschitz continuous first partial derivatives,

L. Armijo, “Minimization of functions having lipschitz continuous first partial derivatives,”Pac. J. Math., vol. 16, pp. 1–3, 1966

work page 1966

[22] [22]

Introduction to the non-asymptotic anal- ysis of random matrices,

R. Vershynin, “Introduction to the non-asymptotic anal- ysis of random matrices,” inCompressed Sensing: The- ory and Applications, pp. 210–268. Cambridge Univer- sity Press, 2012

work page 2012

[23] [23]

Har- monized us national health and nutrition examination survey 1988-2018 for high throughput exposome-health discovery,

V . K. Nguyen, L. Y . Middleton, L. Huang, et al., “Har- monized us national health and nutrition examination survey 1988-2018 for high throughput exposome-health discovery,”MedRxiv, 2023

work page 1988

[24] [24]

Uk biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age,

C. Sudlow, J. Gallacher, N. Allen, et al., “Uk biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age,” PLoS Med., vol. 12, pp. e1001779, 2015

work page 2015

[25] [25]

Afs: An attention-based mechanism for supervised feature selection,

N. Gui, D. Ge, Z. Hu, et al., “Afs: An attention-based mechanism for supervised feature selection,” inAAAI, 2019, vol. 33, pp. 3705–3713

work page 2019

[26] [26]

A unified approach to in- terpreting model predictions,

S. M. Lundberg and S. I. Lee, “A unified approach to in- terpreting model predictions,”Adv. Neural Inf. Process. Syst., vol. 30, 2017

work page 2017

[27] [27]

Measuring stability of feature selection in biomedical datasets,

J. L. Lustgarten, V . Gopalakrishnan, and S. Visweswaran, “Measuring stability of feature selection in biomedical datasets,” inProc. AMIA Annu. Symp., 2009, vol. 2009, p. 406

work page 2009

[28] [28]

Optuna: A next- generation hyperparameter optimization framework,

T. Akiba, S. Sano, T. Yanase, et al., “Optuna: A next- generation hyperparameter optimization framework,” in Proc. ACM SIGKDD Int. Conf. Knowl. Discov. Data Min., 2019, pp. 2623–2631

work page 2019

[29] [29]

Association between lactate dehydrogenase levels and all-cause mortality in icu patients with heart failure: a retrospective analysis of the mimic-iv database,

P. Guo, H. Ding, X. Li, et al., “Association between lactate dehydrogenase levels and all-cause mortality in icu patients with heart failure: a retrospective analysis of the mimic-iv database,”BMC Cardiovasc. Disord., vol. 25, pp. 62, 2025

work page 2025