Radiomic Feature Selection Using Gradient Loss of Deep Neural Network for Lung Cancer Stage Detection
Pith reviewed 2026-06-28 07:13 UTC · model grok-4.3
The pith
A deep neural network's loss gradients rank radiomic features so that the top 15 classify lung cancer stages at 90 percent accuracy.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The GL-RFE framework evaluates feature importance by computing gradients of the network loss with respect to input features and recursively eliminates features with minimal contribution. The resulting top-15 radiomic features are used to train a deep neural network classifier for distinguishing early-stage and advanced-stage lung cancer. The proposed framework achieves accuracy of 90.22 percent, precision of 90.10 percent, recall of 90.24 percent, and F1-score of 90.16 percent on the test dataset, with visualizations confirming reduced redundancy and improved separability compared with conventional selection methods.
What carries the argument
The Gradient-Loss Recursive Feature Elimination (GL-RFE) framework, which ranks each radiomic feature by the magnitude of the gradient of the classification loss with respect to that feature and removes the lowest-ranked ones before retraining.
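The loop described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the one-hidden-layer NumPy network, the hyperparameters, the one-feature-per-round schedule, and the function names `train_mlp`, `input_gradient_scores`, and `gl_rfe` are all assumptions, and the score used here (mean absolute input gradient over samples) is one plausible reading of "magnitude of the gradient".

```python
import numpy as np

def _forward(params, X):
    """Return hidden activations and predicted probabilities."""
    W1, b1, w2, b2 = params
    H = np.tanh(X @ W1 + b1)
    p = 1.0 / (1.0 + np.exp(-(H @ w2 + b2)))
    return H, p

def train_mlp(X, y, hidden=8, epochs=500, lr=0.2, seed=0):
    """Train a small tanh network on binary cross-entropy (assumed stand-in DNN)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 0.1, hidden)
    b2 = 0.0
    for _ in range(epochs):
        H, p = _forward((W1, b1, w2, b2), X)
        dz = (p - y) / n                          # gradient of mean loss w.r.t. logits
        dH = np.outer(dz, w2) * (1.0 - H ** 2)    # backpropagate through tanh
        w2 = w2 - lr * (H.T @ dz)
        b2 = b2 - lr * dz.sum()
        W1 = W1 - lr * (X.T @ dH)
        b1 = b1 - lr * dH.sum(axis=0)
    return W1, b1, w2, b2

def input_gradient_scores(params, X, y):
    """Mean |d loss_i / d x_ij| over samples: one importance score per feature."""
    W1, _, w2, _ = params
    H, p = _forward(params, X)
    dH = np.outer(p - y, w2) * (1.0 - H ** 2)
    dX = dH @ W1.T                                # per-sample gradient w.r.t. inputs
    return np.abs(dX).mean(axis=0)

def gl_rfe(X, y, n_keep, drop_per_round=1):
    """Retrain, rank by input-gradient magnitude, drop the weakest, repeat."""
    kept = np.arange(X.shape[1])
    while kept.size > n_keep:
        params = train_mlp(X[:, kept], y)
        scores = input_gradient_scores(params, X[:, kept], y)
        n_drop = min(drop_per_round, kept.size - n_keep)
        kept = np.delete(kept, np.argsort(scores)[:n_drop])
    return kept
```

With a 106-column feature matrix, `gl_rfe(X_train, y_train, n_keep=15)` would return the column indices of the retained features. Passing only training rows matters, because the ranking otherwise sees the data later used for evaluation.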
If this is right
- The top-15 features selected by GL-RFE allow a deep neural network to reach 90.22 percent accuracy and comparable scores on precision, recall, and F1 for early versus advanced lung cancer staging.
- GL-RFE captures nonlinear interactions among features that conventional selection methods overlook, leading to lower redundancy visible in correlation heat maps.
- The same protocol is described as applicable to other high-dimensional small-sample biomedical problems such as genomics.
- Visualization of the selected features shows clearer separation between the two cancer-stage classes than the full feature set.
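The redundancy claim in the list above rests on correlation heat maps, which are qualitative. One way to put a number on it, offered here as an assumption rather than anything the paper reports, is the mean absolute pairwise Pearson correlation among the retained features. The function name `mean_abs_offdiag_corr` is hypothetical.

```python
import numpy as np

def mean_abs_offdiag_corr(X):
    """Average |Pearson r| over all distinct pairs of columns in X.

    Values near 0 suggest little linear redundancy; values near 1 suggest
    the features largely duplicate one another. Constant columns yield NaN.
    """
    corr = np.corrcoef(X, rowvar=False)
    upper = np.triu_indices_from(corr, k=1)   # each pair once, diagonal excluded
    return float(np.abs(corr[upper]).mean())
```

Comparing this value for the full feature matrix against the 15-column subset would give a single figure to accompany the heat maps. It only captures linear dependence, so it does not by itself test the nonlinear-interaction claim.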
Where Pith is reading between the lines
- If the gradient ranking remains stable across slight changes in network architecture, the same 15 features might serve as a compact signature for staging models built by different research groups.
- Extending the loss-gradient step to multimodal inputs could let the method jointly prune imaging and genomic variables without separate pipelines.
- Repeated application of the pruning loop until performance plateaus would give a data-driven way to decide the exact number of retained features rather than fixing it at 15.
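The plateau idea in the last bullet can be made concrete with a simple tolerance rule. The sketch below is an assumption about how such a rule might look; the function name, the tolerance value, and the example scores are made up for illustration and do not come from the paper.

```python
def smallest_k_within_tolerance(scores_by_k, tol=0.01):
    """Return the smallest feature count whose score is within tol of the best.

    scores_by_k maps a feature count k to a validation score (higher is better).
    """
    best = max(scores_by_k.values())
    return min(k for k, score in scores_by_k.items() if score >= best - tol)

# Hypothetical validation accuracies recorded during pruning (not from the paper).
example_scores = {106: 0.88, 50: 0.90, 15: 0.895, 5: 0.80}
chosen_k = smallest_k_within_tolerance(example_scores, tol=0.01)
```

The scores fed to such a rule would need to come from validation folds rather than the test split, otherwise the choice of k becomes another source of optimistic bias.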
Load-bearing premise
That a feature ranking derived from gradients computed on a network trained with all 106 features still produces an unbiased importance order when the pruned set is used to train and evaluate a new model on the same test split.
What would settle it
Run the full GL-RFE pipeline on one patient cohort, then test the final 15-feature classifier on an entirely independent external cohort. Accuracy below 80 percent on that cohort would count against the generalization claim.
Original abstract
Radiomics enables extraction of quantitative imaging biomarkers from medical images and has become an important tool for computer-aided cancer diagnosis. However, radiomics datasets are typically high-dimensional with limited samples, making feature selection a critical step for building reliable predictive models. This study proposes a Gradient-Loss Recursive Feature Elimination (GL-RFE) framework that integrates gradient sensitivity analysis from a deep neural network to identify the most influential radiomic features for lung cancer stage detection. A total of 106 radiomic features were extracted from chest Computed Tomography (CT) scans using the PyRadiomics extension of the 3D Slicer platform. The proposed method evaluates feature importance by computing gradients of the network loss with respect to input features and recursively eliminates features with minimal contribution. The resulting top-15 radiomic features are used to train a deep neural network classifier for distinguishing early-stage and advanced-stage lung cancer. The proposed framework achieves strong classification performance, with accuracy of 90.22%, precision of 90.10%, recall of 90.24%, and F1-score of 90.16% on the test dataset. Visualization analyses, including correlation heat maps and distribution plots, further confirm reduced feature redundancy and improved class separability. Compared to conventional feature selection techniques, GL-RFE effectively captures nonlinear feature interactions and enhances model generalization. The presented protocol provides a reproducible and interpretable methodology for radiomics-based cancer stage detection and is particularly suitable for high-dimensional, small-sample biomedical datasets, with potential applications in other domains such as genomics and multimodal clinical analysis.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a Gradient-Loss Recursive Feature Elimination (GL-RFE) framework that trains a DNN on 106 radiomic features extracted from CT scans, ranks features by loss gradients, recursively prunes to the top-15, and retrains a DNN classifier to distinguish early- versus advanced-stage lung cancer. It reports test-set performance of 90.22% accuracy, 90.10% precision, 90.24% recall and 90.16% F1-score, along with visualizations showing reduced redundancy and improved separability, claiming advantages over conventional feature selection for high-dimensional, small-sample radiomics data.
Significance. If the reported metrics reflect generalization without leakage from the feature-selection step, the approach could provide a practical, gradient-based alternative for feature ranking in radiomics that captures nonlinear interactions. The emphasis on reproducibility and applicability to other high-dimensional domains is a positive aspect, but the absence of validation details prevents assessment of whether these strengths are realized.
major comments (2)
- [Abstract / Methods] Abstract and Methods: The GL-RFE procedure computes loss gradients from a DNN trained on the full set of 106 features before ranking and pruning; the text provides no indication that this step (or the recursive elimination) was confined to training folds only. Without an explicit nested cross-validation protocol, the subsequent test-set metrics (90.22% accuracy etc.) may incorporate information leakage from the selection process itself, directly undermining the central performance claim.
- [Results] Results: No cross-validation procedure, statistical significance tests, dataset sample size, or explicit baseline comparisons (e.g., against standard RFE, LASSO, or random-forest importance) are described for the reported test metrics. These omissions make it impossible to determine whether the top-15 feature subset yields genuinely superior generalization or merely reflects optimistic bias from the selection step.
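The leakage concern in the major comments comes down to where the selection step runs. The sketch below shows the outer loop of a cross-validation in which feature selection sees only the training rows of each fold. It is an illustration under stated assumptions: `top_corr` and `nearest_centroid` are deliberately simple stand-ins for GL-RFE and the DNN classifier, and a fully nested scheme would add an inner loop for hyperparameters.

```python
import numpy as np

def kfold_indices(n, k, seed=0):
    """Split range(n) into k shuffled, disjoint index arrays."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n), k)

def cv_with_inner_selection(X, y, select, fit_predict, k=5, seed=0):
    """Per fold: select features on training rows only, then score held-out rows."""
    folds = kfold_indices(len(y), k, seed)
    accuracies, chosen = [], []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        cols = select(X[train_idx], y[train_idx])      # held-out rows never seen here
        pred = fit_predict(X[train_idx][:, cols], y[train_idx], X[test_idx][:, cols])
        accuracies.append(float(np.mean(pred == y[test_idx])))
        chosen.append(cols)
    return accuracies, chosen

def top_corr(X, y, n_keep=2):
    """Stand-in selector: keep the features most correlated with the label."""
    Xc, yc = X - X.mean(axis=0), y - y.mean()
    r = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc) + 1e-12)
    return np.argsort(r)[-n_keep:]

def nearest_centroid(X_train, y_train, X_test):
    """Stand-in classifier: assign each test row to the closer class mean."""
    c0 = X_train[y_train == 0].mean(axis=0)
    c1 = X_train[y_train == 1].mean(axis=0)
    d0 = np.linalg.norm(X_test - c0, axis=1)
    d1 = np.linalg.norm(X_test - c1, axis=1)
    return (d1 < d0).astype(int)
```

Because the selected columns can differ from fold to fold, the returned `chosen` list also gives a rough view of how stable the ranking is across resamples.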
minor comments (2)
- [Abstract] The abstract states that 106 features were extracted via PyRadiomics but does not report the number of patients or scans; this basic detail is needed to evaluate the small-sample regime claimed in the introduction.
- [Results] Visualization analyses (correlation heat maps, distribution plots) are mentioned but lack quantitative metrics (e.g., mutual information reduction or separability scores) that would strengthen the claim of improved class separability.
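As one example of the kind of separability score the last comment asks for, a per-feature Fisher ratio for two classes can be computed directly. This is a sketch of a standard statistic, not a metric the manuscript reports; the function name `fisher_ratio` is hypothetical.

```python
import numpy as np

def fisher_ratio(X, y):
    """Per-feature (mean1 - mean0)^2 / (var1 + var0) for labels y in {0, 1}.

    Larger values mean the class means are far apart relative to the
    within-class spread. A feature with zero variance in both classes
    divides by zero and should be filtered beforehand.
    """
    X0, X1 = X[y == 0], X[y == 1]
    between = (X1.mean(axis=0) - X0.mean(axis=0)) ** 2
    within = X1.var(axis=0) + X0.var(axis=0)
    return between / within
```

Reporting the average ratio for the selected subset next to the average for all 106 features would turn the distribution plots into a comparable number, though it treats features one at a time and ignores their joint structure.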
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive feedback. The concerns regarding potential information leakage in feature selection and the lack of reported validation procedures are valid points that require clarification and expansion in the manuscript. We will revise the paper to address these issues explicitly while preserving the core contributions of the GL-RFE approach.
Point-by-point responses
- Referee: [Abstract / Methods] Abstract and Methods: The GL-RFE procedure computes loss gradients from a DNN trained on the full set of 106 features before ranking and pruning; the text provides no indication that this step (or the recursive elimination) was confined to training folds only. Without an explicit nested cross-validation protocol, the subsequent test-set metrics (90.22% accuracy etc.) may incorporate information leakage from the selection process itself, directly undermining the central performance claim.
  Authors: We agree that the original manuscript did not explicitly describe the cross-validation protocol. The gradient computation and recursive elimination were in fact performed exclusively within training folds of a nested cross-validation scheme, with the outer test set held completely out of the selection process. We will revise the Methods section to provide a full description of this nested CV procedure (including fold counts and the separation of selection from final evaluation) and add a schematic diagram to make the workflow unambiguous. This will confirm that the reported test metrics are free of leakage from the feature-selection step.
  Revision: yes
- Referee: [Results] Results: No cross-validation procedure, statistical significance tests, dataset sample size, or explicit baseline comparisons (e.g., against standard RFE, LASSO, or random-forest importance) are described for the reported test metrics. These omissions make it impossible to determine whether the top-15 feature subset yields genuinely superior generalization or merely reflects optimistic bias from the selection step.
  Authors: We acknowledge that the Results section omitted these essential details. The revised manuscript will report the dataset sample size, describe the full cross-validation protocol (including the nested scheme used for feature selection), include statistical significance tests comparing GL-RFE against baselines, and add explicit performance comparisons with standard RFE, LASSO, and random-forest importance rankings. These additions will allow readers to evaluate whether the top-15 subset provides genuine generalization gains.
  Revision: yes
Circularity Check
No significant circularity detected
The paper presents an empirical ML pipeline for radiomic feature selection via gradient-loss RFE followed by DNN classification, with performance metrics reported on a held-out test split. No equations, derivations, or claims reduce by construction to their own inputs (no self-definitional loops, no fitted ranking renamed as independent prediction, no load-bearing self-citations or imported uniqueness theorems). The described process is a standard experimental workflow on high-dimensional data rather than a closed logical chain; any concerns about potential data leakage in feature selection fall under methodological risk, not the enumerated circularity patterns. The derivation remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
free parameters (2)
- number of retained features = 15
- DNN architecture and training hyperparameters
axioms (1)
- standard math: The classification loss is differentiable with respect to each input radiomic feature.
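Under this axiom, the importance score that drives the ranking can be written out explicitly. The form below is a sketch and rests on assumptions: the symbols $f_\theta$, $\mathcal{L}$, $N$, and $s_j$ are introduced here for illustration, and averaging the absolute gradient over training samples is one common convention that the source does not confirm.

```latex
% x_i \in \mathbb{R}^{106}: radiomic feature vector of sample i, with label y_i
% f_\theta: the trained network; \mathcal{L}: the classification loss
s_j \;=\; \frac{1}{N} \sum_{i=1}^{N}
      \left| \frac{\partial \mathcal{L}\bigl(f_\theta(x_i),\, y_i\bigr)}{\partial x_{ij}} \right|,
\qquad j = 1, \dots, 106 .
```

Features with the smallest $s_j$ are removed, the network is retrained on the remaining columns, and the scores are recomputed until 15 features remain.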
Reference graph
Works this paper leans on
-
[1]
Radiomics
EXTRACTION OF RADIOMIC FEATURES USING 3D SLICER PyRadiomics EXTENSION. The following steps compute radiomic features of a lung CT DICOM file using the 3D Slicer PyRadiomics extension and save them to a file in comma-separated values (CSV) format. 1.1. Install 3D Slicer: install and open 3D Slicer (use the latest stable release from https://download...
-
[2]
DEVELOPING A RADIOMICS-BASED CANCER DETECTION MODEL USING PYTHON LIBRARIES. The following steps summarize how a user can develop, train, and test a cancer detection model using Python libraries with radiomic features of CT datasets. 2.1. Format the radiomics dataset saved in CSV format so that each row represents one patient/sample, and each colum...
-
[3]
Run the Python code in Jupyter notebook
RUNNING THE JUPYTER NOTEBOOK TO BUILD AND TEST MODEL. 3.1. Run the Python code in Jupyter notebook. A prompt to upload the radiomics CSV file appears, as shown in Figure 4. 3.2. Upload the radiomics.csv file. 3.3. Save the classification results and generated graphs. REPRESENTATIVE RESULTS: DATASET SUMMARY. The NSCLC Radiomics dataset comprises 422 CT volumes ...
-
[4]
Deep learning and MRI biomarkers for precise lung Cancer cell detection and diagnosis
Kumar S, Singh J, Ravi V, Singh P, Al Mazroa A, Diwakar M, Gupta I. Deep learning and MRI biomarkers for precise lung Cancer cell detection and diagnosis. The Open Bioinformatics Journal. 2024 Sep 19;17(1). Avanzo M, Stancanello J, Pirrone G. Radiomics and deep learning in lung cancer. Strahlenther Onkol. 2020;196:879–887
2024
-
[5]
Radiomics based likelihood functions for cancer diagnosis
Shakir H, Deng Y, Rasheed H, et al. Radiomics based likelihood functions for cancer diagnosis. Sci Rep. 2019; 9:9501
2019
-
[6]
A deep learning-based cancer survival time classifier for small datasets
Shakir H, Aijaz B, Khan TM, Hussain M. A deep learning-based cancer survival time classifier for small datasets. Computers in Biology and Medicine. 2023 Jun 1;160:106896
2023
-
[7]
Applications of radiomics in precision diagnosis and treatment of oncology: opportunities and challenges
Liu Z, Wang S, Dong D, Wei J, Fang C, Zhou X, et al. Applications of radiomics in precision diagnosis and treatment of oncology: opportunities and challenges. Theranostics. 2019;9(5):1303–1322
2019
-
[8]
Insights into radiomics: impact of feature selection and classification
Perniciano A, Loddo A, Di Ruberto C, et al. Insights into radiomics: impact of feature selection and classification. Multimed Tools Appl. 2025; 84:31695–31721
2025
-
[9]
Radiomic feature selection for lung cancer classifiers
Shakir H, Rasheed H, Khan TMR. Radiomic feature selection for lung cancer classifiers. J Intell Fuzzy Syst. 2020;38(5):5847–5855
2020
-
[10]
Analyzing the impact of feature selection methods on machine learning algorithms for heart disease prediction
Noroozi Z, Orooji A, Erfannia L. Analyzing the impact of feature selection methods on machine learning algorithms for heart disease prediction. Sci Rep. 2023; 13:22588
2023
-
[11]
A comprehensive survey on feature selection in various fields of machine learning
Dhal P, Azad C. A comprehensive survey on feature selection in various fields of machine learning. Appl Intell. 2022; 52:4543–4581
2022
-
[12]
Artificial intelligence: deep learning in oncological radiomics and challenges of interpretability and data harmonization
Papadimitroulas P, Brocki L, Chung NC, Marchadour W, Vermet F, Gaubert L, et al. Artificial intelligence: deep learning in oncological radiomics and challenges of interpretability and data harmonization. Phys Med. 2021; 83:108–121
2021
-
[13]
Preselection of robust radiomic features does not improve outcome modelling in non-small cell lung cancer based on clinical routine FDG-PET imaging
Oliveira C, et al. Preselection of robust radiomic features does not improve outcome modelling in non-small cell lung cancer based on clinical routine FDG-PET imaging. EJNMMI Res. 2021; 11:79
2021
-
[14]
Kolukisa B, Bakir-Gungor B. Ensemble feature selection and classification methods for machine learning-based coronary artery disease diagnosis. Comput Stand Interfaces. 2023;84:103706. doi:10.1016/j.csi.2022.103706
-
[15]
Feature selection may improve deep neural networks for the bioinformatics problems
Chen Z, Pang M, Zhao Z, Li S, Miao R, Zhang Y, Feng X, Feng X, Zhang Y, Duan M, Huang L. Feature selection may improve deep neural networks for the bioinformatics problems. Bioinformatics. 2020 Mar;36(5):1542-52
2020
-
[16]
Feature selection using deep neural networks
Roy D, Murty KSR, Mohan CK. Feature selection using deep neural networks. In: Proceedings of the International Joint Conference on Neural Networks (IJCNN); 2015; Killarney, Ireland. p. 1–6. doi:10.1109/IJCNN.2015.7280626
-
[17]
Lung cancer detection using VGG16 and CNN
Kapoor V, Mittal A, Garg S, Diwakar M, Mishra AK, Singh P. Lung cancer detection using VGG16 and CNN. In: 2023 IEEE World Conference on Applied Intelligence and Computing (AIC) 2023 Jul 29 (pp. 758-762). IEEE
2023
-
[18]
A Comprehensive Review of CNN and Multimodal AI Techniques for Skin and Lung Cancer Diagnosis
Mishra N, Diwakar M, Roka S, Pandey NK, Singh P, Arya C. A Comprehensive Review of CNN and Multimodal AI Techniques for Skin and Lung Cancer Diagnosis. In: 2025 IEEE 7th International Conference on Computing, Communication and Automation (ICCCA) 2025 Nov 28 (pp. 1-4). IEEE
2025
-
[19]
A CAD system for lung cancer detection using hybrid deep learning techniques
Alsheikhy AA, Said Y, Shawly T, Alzahrani AK, Lahza H. A CAD system for lung cancer detection using hybrid deep learning techniques. Diagnostics (Basel). 2023;13(6):1174. doi:10.3390/diagnostics13061174
-
[20]
Cheung EYW, Kwong VHY, Ng KCF, Lui MKY, Li VTW, Lee RST, et al. Overall staging prediction for non-small cell lung cancer (NSCLC): a local pilot study with artificial neural network approach. Cancers (Basel). 2025;17(3):523. doi:10.3390/cancers17030523
-
[21]
Hugo A, et al. Data from NSCLC-Radiomics. The Cancer Imaging Archive. 2015. Available from: https://doi.org/10.7937/K9/TCIA.2015.PF0M9REI
-
[22]
Computational radiomics system to decode the radiographic phenotype
Van Griethuysen JJM, et al. Computational radiomics system to decode the radiographic phenotype. Cancer Res. 2017;77:e104–e107
2017
-
[23]
3D Slicer: a platform for subject-specific image analysis, visualization, and clinical support
Kikinis R, Pieper SD, Vosburgh KG. 3D Slicer: a platform for subject-specific image analysis, visualization, and clinical support. In: Jolesz F, editor. Intraoperative Imaging and Image-Guided Therapy. New York: Springer; 2014
2014
-
[24]
Lei M, Varghese B, Hwang D, Cen S, Lei X, Desai B, et al. Benchmarking various radiomic toolkit features while applying the image biomarker standardization initiative toward clinical translation of radiomic analysis. J Digit Imaging. 2021;34(5):1156–1170. doi:10.1007/s10278-021-00506-6
-
[25]
A theoretical distribution analysis of synthetic minority oversampling technique (SMOTE) for imbalanced learning
Elreedy D, Atiya AF, Kamalov F. A theoretical distribution analysis of synthetic minority oversampling technique (SMOTE) for imbalanced learning. Mach Learn. 2024;113:4903–
2024
-
[26]
doi:10.1007/s10994-022-06296-4
-
[27]
A closer look at classification evaluation metrics and a critical reflection of common evaluation practice
Opitz J. A closer look at classification evaluation metrics and a critical reflection of common evaluation practice. Trans Assoc Comput Linguist. 2024;12:820–836. doi:10.1162/tacl_a_00675
-
[28]
He Y, Duan S, Wang W, et al. Integrative radiomics clustering analysis to decipher breast cancer heterogeneity and prognostic indicators through multiparametric MRI. NPJ Breast Cancer. 2024; 10:72. doi:10.1038/s41523-024-00678-8
-
[29]
Predicting sport event outcomes using deep learning
Gao J, Cheng Y, Gao J. Predicting sport event outcomes using deep learning. PeerJ Comput Sci. 2025;11:e3011. doi:10.7717/peerj-cs.3011
-
[30]
Spie charts, target plots, and radar plots for displaying comparative outcomes of health care
Stafoggia M, Lallo A, Fusco D, Barone AP, D'Ovidio M, Sorge C, et al. Spie charts, target plots, and radar plots for displaying comparative outcomes of health care. J Clin Epidemiol. 2011;64(7):770–778. doi:10.1016/j.jclinepi.2010.10.009
2011
-
[31]
Review of feature selection approaches based on grouping of features
Kuzudisli C, Bakir-Gungor B, Bulut N, Qaqish B, Yousef M. Review of feature selection approaches based on grouping of features. PeerJ. 2023;11:e15666. doi:10.7717/peerj.15666
-
[32]
Biomedical physics & engineering express
Raptis S, Ilioudis C, Theodorou K. Biomedical physics & engineering express. Biomed Phys Eng Express. 2024;10(3):035016
2024
-
[33]
Feature selection using autoencoders
Tomar D, Prasad Y, Thakur MK, Biswas KK. Feature selection using autoencoders. In: Proceedings of the International Conference on Machine Learning and Data Science (MLDS); 2017; Noida, India. p. 56–60. doi:10.1109/MLDS.2017.20.
[Figures 1–9 not reproduced. Table 1: CT Data Sets Summary (NSCLC Radiomics Da...]