Radiomic Feature Selection Using Gradient Loss of Deep Neural Network for Lung Cancer Stage Detection

Hina Shakir; Javeed Hussain; Mohammad Mohatram; Muhammad Irfan Memon; Syed Rizwan Ali

arxiv: 2606.04453 · v1 · pith:G5TZ2ILHnew · submitted 2026-06-03 · 💻 cs.CV · cs.LG

Pith reviewed 2026-06-28 07:13 UTC · model grok-4.3

classification 💻 cs.CV cs.LG

keywords radiomics · feature selection · deep neural network · lung cancer staging · computed tomography · gradient loss · recursive feature elimination · cancer diagnosis

The pith

A deep neural network's loss gradients rank radiomic features so that the top 15 classify lung cancer stages at 90 percent accuracy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a Gradient-Loss Recursive Feature Elimination framework that ranks 106 radiomic features extracted from CT scans by computing how each one affects a neural network's loss and then removes the least influential ones in successive rounds. Radiomics problems typically involve far more features than patient samples, so selection must preserve predictive power without introducing selection bias. After pruning to the top 15 features, a fresh network trained only on those features reaches 90.22 percent accuracy, 90.10 percent precision, 90.24 percent recall, and 90.16 percent F1-score on held-out test cases while also showing clearer class separation in distribution plots. The method is presented as an improvement over standard selection techniques because the gradient signal captures nonlinear dependencies that linear filters miss.

Core claim

The GL-RFE framework evaluates feature importance by computing gradients of the network loss with respect to input features and recursively eliminates features with minimal contribution. The resulting top-15 radiomic features are used to train a deep neural network classifier for distinguishing early-stage and advanced-stage lung cancer. The proposed framework achieves accuracy of 90.22 percent, precision of 90.10 percent, recall of 90.24 percent, and F1-score of 90.16 percent on the test dataset, with visualizations confirming reduced redundancy and improved separability compared with conventional selection methods.
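
The four reported scores all derive from confusion-matrix counts. The sketch below shows the standard binary definitions, assuming advanced stage is the positive class (label 1). The abstract does not say which class is positive or whether the scores are macro-averaged, so the helper name `binary_metrics` and the toy labels are illustrative only.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for 0/1 labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (made up for illustration, not data from the paper).
toy_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
print(binary_metrics(toy_true, toy_pred))
```

Reporting the confusion matrix alongside these scores would let readers recompute them under either averaging convention.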

What carries the argument

The Gradient-Loss Recursive Feature Elimination (GL-RFE) framework, which ranks each radiomic feature by the magnitude of the gradient of the classification loss with respect to that feature and removes the lowest-ranked ones before retraining.
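
A minimal NumPy sketch of the loop described above, under stated assumptions: a one-hidden-layer tanh network stands in for the paper's unspecified DNN, importance is taken as the mean absolute input gradient of the binary cross-entropy loss, and one feature is dropped per round. The function names (`train_mlp`, `input_gradient_scores`, `gl_rfe`), the hyperparameters, and the synthetic data are hypothetical and are not the authors' implementation.

```python
import numpy as np


def train_mlp(X, y, hidden=8, epochs=300, lr=0.1, seed=0):
    """Train a one-hidden-layer tanh network with full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, size=(d, hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 0.1, size=hidden)
    b2 = 0.0
    for _ in range(epochs):
        h = np.tanh(X @ W1 + b1)
        p = 1.0 / (1.0 + np.exp(-(h @ w2 + b2)))
        dz = (p - y) / n                        # d(mean BCE) / d(logit)
        dh = np.outer(dz, w2) * (1.0 - h ** 2)  # backpropagate through tanh
        w2 -= lr * (h.T @ dz)
        b2 -= lr * dz.sum()
        W1 -= lr * (X.T @ dh)
        b1 -= lr * dh.sum(axis=0)
    return W1, b1, w2, b2


def input_gradient_scores(params, X, y):
    """Mean absolute gradient of the per-sample BCE loss w.r.t. each input feature."""
    W1, b1, w2, b2 = params
    h = np.tanh(X @ W1 + b1)
    p = 1.0 / (1.0 + np.exp(-(h @ w2 + b2)))
    dz = p - y                                  # per-sample dL / d(logit)
    dh = np.outer(dz, w2) * (1.0 - h ** 2)      # shape (n, hidden)
    dX = dh @ W1.T                              # shape (n, d): dL / dx
    return np.abs(dX).mean(axis=0)


def gl_rfe(X, y, n_keep, drop_per_round=1):
    """Retrain, score, and drop the lowest-scoring features until n_keep remain."""
    kept = list(range(X.shape[1]))
    while len(kept) > n_keep:
        params = train_mlp(X[:, kept], y)
        scores = input_gradient_scores(params, X[:, kept], y)
        n_drop = min(drop_per_round, len(kept) - n_keep)
        worst = set(np.argsort(scores)[:n_drop].tolist())
        kept = [f for i, f in enumerate(kept) if i not in worst]
    return kept


# Synthetic demo (assumption): only features 0 and 1 carry class signal.
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(200, 6))
y_demo = (X_demo[:, 0] + X_demo[:, 1] > 0).astype(float)
print(gl_rfe(X_demo, y_demo, n_keep=2))
```

In a real study the call would receive only training rows, so that the ranking never sees the cases used for evaluation.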

If this is right

The top-15 features selected by GL-RFE allow a deep neural network to reach 90.22 percent accuracy and comparable scores on precision, recall, and F1 for early versus advanced lung cancer staging.
GL-RFE captures nonlinear interactions among features that conventional selection methods overlook, leading to lower redundancy visible in correlation heat maps.
The same protocol is described as applicable to other high-dimensional small-sample biomedical problems such as genomics.
Visualization of the selected features shows clearer separation between the two cancer-stage classes than the full feature set.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

If the gradient ranking remains stable across slight changes in network architecture, the same 15 features might serve as a compact signature for staging models built by different research groups.
Extending the loss-gradient step to multimodal inputs could let the method jointly prune imaging and genomic variables without separate pipelines.
Repeated application of the pruning loop until performance plateaus would give a data-driven way to decide the exact number of retained features rather than fixing it at 15.
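
The plateau idea in the last item can be sketched as a simple rule: keep the smallest feature count whose validation score is within a tolerance of the best score seen. The function name `smallest_plateau_size`, the tolerance, and the scores below are invented placeholders for illustration, not results from the paper.

```python
def smallest_plateau_size(scores_by_size, tolerance=0.01):
    """Return the smallest feature count scoring within `tolerance` of the best."""
    best = max(scores_by_size.values())
    eligible = [size for size, score in scores_by_size.items()
                if score >= best - tolerance]
    return min(eligible)


# Hypothetical validation scores keyed by number of retained features.
placeholder_scores = {100: 0.80, 50: 0.84, 25: 0.86, 12: 0.855, 6: 0.79}
print(smallest_plateau_size(placeholder_scores))
```

The scores fed to such a rule would need to come from validation folds inside the training data, otherwise the chosen cutoff itself becomes a source of optimistic bias.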

Load-bearing premise

That a feature ranking derived from gradients computed on a network trained with all 106 features still produces an unbiased importance order when the pruned set is used to train and evaluate a new model on the same test split.

What would settle it

Running the full GL-RFE pipeline on one patient cohort and then testing the final 15-feature classifier on an entirely independent external cohort and obtaining accuracy below 80 percent.

Figures

Figures reproduced from arXiv: 2606.04453 by Hina Shakir, Javeed Hussain, Mohammad Mohatram, Muhammad Irfan Memon, Syed Rizwan Ali.

**Figure 1.** Workflow diagram of the proposed method, in which radiomic features with low mean gradients of the backpropagation loss are iteratively dropped and a DNN is trained on the top 15 features for cancer detection. [PITH_FULL_IMAGE:figures/full_fig_p007_1.png]

**Figure 8.** Kernel density estimation (KDE) plots of class-wise feature distributions, highlighting their strong discriminative capability. [PITH_FULL_IMAGE:figures/full_fig_p007_8.png]

Original abstract

Radiomics enables extraction of quantitative imaging biomarkers from medical images and has become an important tool for computer-aided cancer diagnosis. However, radiomics datasets are typically high-dimensional with limited samples, making feature selection a critical step for building reliable predictive models. This study proposes a Gradient-Loss Recursive Feature Elimination (GL-RFE) framework that integrates gradient sensitivity analysis from a deep neural network to identify the most influential radiomic features for lung cancer stage detection. A total of 106 radiomic features were extracted from chest Computed Tomography (CT) scans using the PyRadiomics extension of the 3D Slicer platform. The proposed method evaluates feature importance by computing gradients of the network loss with respect to input features and recursively eliminates features with minimal contribution. The resulting top-15 radiomic features are used to train a deep neural network classifier for distinguishing early-stage and advanced-stage lung cancer. The proposed framework achieves strong classification performance, with accuracy of 90.22%, precision of 90.10%, recall of 90.24%, and F1-score of 90.16% on the test dataset. Visualization analyses, including correlation heat maps and distribution plots, further confirm reduced feature redundancy and improved class separability. Compared to conventional feature selection techniques, GL-RFE effectively captures nonlinear feature interactions and enhances model generalization. The presented protocol provides a reproducible and interpretable methodology for radiomics-based cancer stage detection and is particularly suitable for high-dimensional, small-sample biomedical datasets, with potential applications in other domains such as genomics and multimodal clinical analysis.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The reported 90% test metrics are likely optimistic because feature ranking via gradients probably incorporated test data.

The reported test accuracy of 90.22% and matching precision, recall, and F1 scores are hard to trust at face value. The GL-RFE procedure trains a network on the full set of 106 features to compute loss gradients, ranks and prunes down to the top 15, then retrains and evaluates on the held-out test split. The abstract supplies no evidence that the gradient step or the elimination loop was restricted to training folds only. When selection sees test information, the final numbers partly measure how well the chosen subset fits the evaluation data rather than true generalization.

The paper applies gradient sensitivity inside recursive feature elimination to radiomic features for distinguishing early versus advanced lung cancer on CT. This targets the standard high-dimensional, low-sample problem in the field. They extract features with PyRadiomics, keep the top 15, and include correlation heat maps plus distribution plots that show reduced redundancy and better class separation. Those visualizations are a clear, useful addition.

The central shortcoming is the missing outer validation loop. Without nested cross-validation or explicit confirmation that selection stayed inside training data, the performance edge over conventional methods cannot be assessed cleanly. The abstract also omits cross-validation details, statistical tests, and direct baseline comparisons, which leaves the generalization claim under-supported.

The work is aimed at radiomics practitioners who need practical feature pruning for small medical datasets. Readers seeking novel theory will not find it; the contribution is an application of established gradient ranking plus elimination. It deserves peer review so referees can ask for the split protocol and a corrected run with proper nesting. The target problem is real and the visualizations add value once the leakage issue is fixed.
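
The protocol the letter asks for can be sketched as a cross-validation loop in which scaling and feature selection are refit on each training fold and the held-out fold is used only for scoring. This is a hedged illustration: `select_top_k` is a deliberately simple stand-in ranking (absolute difference of class means), not GL-RFE, and the nearest-centroid classifier, function names, and fold count are assumptions chosen to keep the example short.

```python
import numpy as np


def kfold_indices(n, k, seed=0):
    """Shuffle sample indices and split them into k disjoint folds."""
    idx = np.random.default_rng(seed).permutation(n)
    return np.array_split(idx, k)


def select_top_k(X, y, k):
    """Stand-in ranking (NOT GL-RFE): absolute difference of class means."""
    diff = np.abs(X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0))
    return np.argsort(diff)[::-1][:k]


def nearest_centroid_predict(X_train, y_train, X_test):
    """Assign each test row to the class with the closer training centroid."""
    c0 = X_train[y_train == 0].mean(axis=0)
    c1 = X_train[y_train == 1].mean(axis=0)
    d0 = np.linalg.norm(X_test - c0, axis=1)
    d1 = np.linalg.norm(X_test - c1, axis=1)
    return (d1 < d0).astype(int)


def leakage_free_cv(X, y, k_features, n_folds=5):
    """Per-fold accuracy with scaling and selection fit on training rows only."""
    folds = kfold_indices(len(y), n_folds)
    accuracies = []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        mu = X[train_idx].mean(axis=0)            # training-fold statistics only
        sd = X[train_idx].std(axis=0) + 1e-12
        X_tr = (X[train_idx] - mu) / sd
        X_te = (X[test_idx] - mu) / sd
        cols = select_top_k(X_tr, y[train_idx], k_features)  # never sees test rows
        pred = nearest_centroid_predict(X_tr[:, cols], y[train_idx], X_te[:, cols])
        accuracies.append(float((pred == y[test_idx]).mean()))
    return accuracies


# Pure-noise demo (assumption): labels are unrelated to the features.
rng = np.random.default_rng(0)
X_noise = rng.normal(size=(100, 50))
y_noise = np.arange(100) % 2
print(leakage_free_cv(X_noise, y_noise, k_features=5))
```

On noise data like this, a protocol that selects features on all rows before splitting tends to report accuracy above chance, while the fold-wise version should stay near it; running both on the paper's data would show how much of the reported score depends on the selection step.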

Referee Report

2 major / 2 minor

Summary. The manuscript proposes a Gradient-Loss Recursive Feature Elimination (GL-RFE) framework that trains a DNN on 106 radiomic features extracted from CT scans, ranks features by loss gradients, recursively prunes to the top-15, and retrains a DNN classifier to distinguish early- versus advanced-stage lung cancer. It reports test-set performance of 90.22% accuracy, 90.10% precision, 90.24% recall and 90.16% F1-score, along with visualizations showing reduced redundancy and improved separability, claiming advantages over conventional feature selection for high-dimensional, small-sample radiomics data.

Significance. If the reported metrics reflect generalization without leakage from the feature-selection step, the approach could provide a practical, gradient-based alternative for feature ranking in radiomics that captures nonlinear interactions. The emphasis on reproducibility and applicability to other high-dimensional domains is a positive aspect, but the absence of validation details prevents assessment of whether these strengths are realized.

major comments (2)

[Abstract / Methods] The GL-RFE procedure computes loss gradients from a DNN trained on the full set of 106 features before ranking and pruning; the text provides no indication that this step (or the recursive elimination) was confined to training folds only. Without an explicit nested cross-validation protocol, the subsequent test-set metrics (90.22% accuracy etc.) may incorporate information leakage from the selection process itself, directly undermining the central performance claim.
[Results] No cross-validation procedure, statistical significance tests, dataset sample size, or explicit baseline comparisons (e.g., against standard RFE, LASSO, or random-forest importance) are described for the reported test metrics. These omissions make it impossible to determine whether the top-15 feature subset yields genuinely superior generalization or merely reflects optimistic bias from the selection step.

minor comments (2)

[Abstract] The abstract states that 106 features were extracted via PyRadiomics but does not report the number of patients or scans; this basic detail is needed to evaluate the small-sample regime claimed in the introduction.
[Results] Visualization analyses (correlation heat maps, distribution plots) are mentioned but lack quantitative metrics (e.g., mutual information reduction or separability scores) that would strengthen the claim of improved class separability.
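
As one possible reading of the second minor comment, the heat maps and distribution plots could each be paired with a single number. The sketch below uses mean absolute off-diagonal Pearson correlation for redundancy and a two-class Fisher score for separability; both metric choices and the function names are assumptions made here, not quantities reported in the paper.

```python
import numpy as np


def mean_abs_offdiag_correlation(X):
    """Average |Pearson r| over distinct feature pairs (lower = less redundant)."""
    corr = np.corrcoef(X, rowvar=False)
    off_diagonal = ~np.eye(corr.shape[0], dtype=bool)
    return float(np.abs(corr[off_diagonal]).mean())


def fisher_scores(X, y):
    """Per-feature two-class Fisher score: (mu1 - mu0)^2 / (var1 + var0)."""
    X0, X1 = X[y == 0], X[y == 1]
    numerator = (X1.mean(axis=0) - X0.mean(axis=0)) ** 2
    denominator = X1.var(axis=0) + X0.var(axis=0) + 1e-12  # avoid division by zero
    return numerator / denominator


# Toy data (made up): feature 0 separates the classes, feature 1 is noise.
rng = np.random.default_rng(0)
y_toy = np.repeat([0, 1], 50)
X_toy = rng.normal(size=(100, 2))
X_toy[:, 0] += 3.0 * y_toy
print(mean_abs_offdiag_correlation(X_toy), fisher_scores(X_toy, y_toy))
```

Computing both numbers for the full 106-feature set and for the selected subset would turn the visual claim into a direct comparison.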

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback. The concerns regarding potential information leakage in feature selection and the lack of reported validation procedures are valid points that require clarification and expansion in the manuscript. We will revise the paper to address these issues explicitly while preserving the core contributions of the GL-RFE approach.

Referee: [Abstract / Methods] The GL-RFE procedure computes loss gradients from a DNN trained on the full set of 106 features before ranking and pruning; the text provides no indication that this step (or the recursive elimination) was confined to training folds only. Without an explicit nested cross-validation protocol, the subsequent test-set metrics (90.22% accuracy etc.) may incorporate information leakage from the selection process itself, directly undermining the central performance claim.

Authors: We agree that the original manuscript did not explicitly describe the cross-validation protocol. The gradient computation and recursive elimination were in fact performed exclusively within training folds of a nested cross-validation scheme, with the outer test set held completely out of the selection process. We will revise the Methods section to provide a full description of this nested CV procedure (including fold counts and the separation of selection from final evaluation) and add a schematic diagram to make the workflow unambiguous. This will confirm that the reported test metrics are free of leakage from the feature-selection step. Revision: yes.

Referee: [Results] No cross-validation procedure, statistical significance tests, dataset sample size, or explicit baseline comparisons (e.g., against standard RFE, LASSO, or random-forest importance) are described for the reported test metrics. These omissions make it impossible to determine whether the top-15 feature subset yields genuinely superior generalization or merely reflects optimistic bias from the selection step.

Authors: We acknowledge that the Results section omitted these essential details. The revised manuscript will report the dataset sample size, describe the full cross-validation protocol (including the nested scheme used for feature selection), include statistical significance tests comparing GL-RFE against baselines, and add explicit performance comparisons with standard RFE, LASSO, and random-forest importance rankings. These additions will allow readers to evaluate whether the top-15 subset provides genuine generalization gains. Revision: yes.

Circularity Check

0 steps flagged

No significant circularity detected

The paper presents an empirical ML pipeline for radiomic feature selection via gradient-loss RFE followed by DNN classification, with performance metrics reported on a held-out test split. No equations, derivations, or claims reduce by construction to their own inputs (no self-definitional loops, no fitted ranking renamed as independent prediction, no load-bearing self-citations or imported uniqueness theorems). The described process is a standard experimental workflow on high-dimensional data rather than a closed logical chain; any concerns about potential data leakage in feature selection fall under methodological risk, not the enumerated circularity patterns. The derivation remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 0 invented entities

The performance numbers rest on an empirical pipeline whose critical choices (network depth, learning-rate schedule, exact train-test partitioning, and the stopping criterion for recursive elimination) are not supplied; these choices function as free parameters that directly determine the reported accuracy.

free parameters (2)

number of retained features = 15
Top-15 is presented as the operating point; the abstract gives no justification or sensitivity analysis for this cutoff.
DNN architecture and training hyperparameters
Required to compute the loss gradients used for ranking; unspecified in the abstract.

axioms (1)

[standard math] The classification loss is differentiable with respect to each input radiomic feature.
Necessary for the gradient-sensitivity step described in the abstract.
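
Under this axiom, one way to write a per-feature ranking score is shown below. It assumes the aggregate is the mean absolute gradient over the N training samples; the abstract does not state the aggregation, and the symbols I_j, f_θ, and x_ij are notation introduced here for illustration.

```latex
I_j \;=\; \frac{1}{N} \sum_{i=1}^{N}
\left| \frac{\partial \mathcal{L}\bigl(f_\theta(\mathbf{x}_i),\, y_i\bigr)}{\partial x_{ij}} \right|,
\qquad j = 1, \dots, 106
```

Here x_ij is the j-th radiomic feature of sample i, f_θ is the trained network, and features with the smallest I_j are the candidates for elimination in each round.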

Reference graph

Works this paper leans on

33 extracted references · 12 canonical work pages

[1]

Radiomics

EXTRACTION OF RADIOMIC FEATURES USING 3D SLICER PyRadiomics EXTENSION The following steps are designed to compute radiomic features of a lung CT DICOM file using 3D Slicer PyRadiomics extension and to save in a file of comma separated value (csv) format. 1.1. Install 3D Slicer Install and open 3D Slicer (use the latest stable release from https://download...
[2]

DEVELOPING A RADIOMICS-BASED CANCER DETECTION MODEL USING PYTHON LIBRARIES The following steps are summarized for a user to develop, train and test a cancer detection model using Python libraries with radiomic features of CT datasets. 2.1. Format the radiomics dataset saved in csv format such that each row should represent one patient/sample, and each colum...
[3]

Run the Python code in Jupyter notebook

RUNNING THE JUPYTER NOTEBOOK TO BUILD AND TEST MODEL 3.1. Run the Python code in Jupyter notebook. A prompt is received to upload the radiomics csv file as shown in Figure 4. 3.2. Upload the radiomics.csv 3.3. Save the classification results and generated graphs. REPRESENTATIVE RESULTS: DATASET SUMMARY The NSCLC Radiomics dataset comprises 422 CT volumes ...
[4]

Deep learning and MRI biomarkers for precise lung Cancer cell detection and diagnosis

Kumar S, Singh J, Ravi V, Singh P, Al Mazroa A, Diwakar M, Gupta I. Deep learning and MRI biomarkers for precise lung Cancer cell detection and diagnosis. The Open Bioinformatics Journal. 2024 Sep 19;17(1). Avanzo M, Stancanello J, Pirrone G. Radiomics and deep learning in lung cancer. Strahlenther Onkol. 2020;196:879–887

2024
[5]

Radiomics based likelihood functions for cancer diagnosis

Shakir H, Deng Y, Rasheed H, et al. Radiomics based likelihood functions for cancer diagnosis. Sci Rep. 2019; 9:9501

2019
[6]

A deep learning-based cancer survival time classifier for small datasets

Shakir H, Aijaz B, Khan TM, Hussain M. A deep learning-based cancer survival time classifier for small datasets. Computers in Biology and Medicine. 2023 Jun 1;160 :106896

2023
[7]

Applications of radiomics in precision diagnosis and treatment of oncology: opportunities and challenges

Liu Z, Wang S, Dong D, Wei J, Fang C, Zhou X, et al. Applications of radiomics in precision diagnosis and treatment of oncology: opportunities and challenges. Theranostics. 2019;9(5):1303–1322

2019
[8]

Insights into radiomics: impact of feature selection and classification

Perniciano A, Loddo A, Di Ruberto C, et al. Insights into radiomics: impact of feature selection and classification. Multimed Tools Appl. 2025; 84:31695–31721

2025
[9]

Radiomic feature selection for lung cancer classifiers

Shakir H, Rasheed H, Khan TMR. Radiomic feature selection for lung cancer classifiers. J Intell Fuzzy Syst. 2020;38(5):5847–5855

2020
[10]

Analyzing the impact of feature selection methods on machine learning algorithms for heart disease prediction

Noroozi Z, Orooji A, Erfannia L. Analyzing the impact of feature selection methods on machine learning algorithms for heart disease prediction. Sci Rep. 2023; 13:22588

2023
[11]

A comprehensive survey on feature selection in various fields of machine learning

Dhal P, Azad C. A comprehensive survey on feature selection in various fields of machine learning. Appl Intell. 2022; 52:4543–4581

2022
[12]

Artificial intelligence: deep learning in oncological radiomics and challenges of interpretability and data harmonization

Papadimitroulas P, Brocki L, Chung NC, Marchadour W, Vermet F, Gaubert L, et al. Artificial intelligence: deep learning in oncological radiomics and challenges of interpretability and data harmonization. Phys Med. 2021; 83:108–121

2021
[13]

Preselection of robust radiomic features does not improve outcome modelling in non-small cell lung cancer based on clinical routine FDG-PET imaging

Oliveira C, et al. Preselection of robust radiomic features does not improve outcome modelling in non-small cell lung cancer based on clinical routine FDG-PET imaging. EJNMMI Res. 2021; 11:79

2021
[14]

Ensemble feature selection and classification methods for machine learning-based coronary artery disease diagnosis

Kolukisa B, Bakir-Gungor B. Ensemble feature selection and classification methods for machine learning-based coronary artery disease diagnosis. Comput Stand Interfaces. 2023;84:103706. doi:10.1016/j.csi.2022.103706

work page doi:10.1016/j.csi.2022.103706 2023
[15]

Feature selection may improve deep neural networks for the bioinformatics problems

Chen Z, Pang M, Zhao Z, Li S, Miao R, Zhang Y, Feng X, Feng X, Zhang Y, Duan M, Huang L. Feature selection may improve deep neural networks for the bioinformatics problems. Bioinformatics. 2020 Mar;36(5):1542-52

2020
[16]

Feature selection using deep neural networks

Roy D, Murty KSR, Mohan CK. Feature selection using deep neural networks. In: Proceedings of the International Joint Conference on Neural Networks (IJCNN); 2015; Killarney, Ireland. p. 1–6. doi:10.1109/IJCNN.2015.7280626

work page doi:10.1109/ijcnn.2015.7280626 2015
[17]

Lung cancer detection using VGG16 and CNN

Kapoor V, Mittal A, Garg S, Diwakar M, Mishra AK, Singh P. Lung cancer detection using VGG16 and CNN. In: 2023 IEEE World Conference on Applied Intelligence and Computing (AIC) 2023 Jul 29 (pp. 758-762). IEEE

2023
[18]

A Comprehensive Review of CNN and Multimodal AI Techniques for Skin and Lung Cancer Diagnosis

Mishra N, Diwakar M, Roka S, Pandey NK, Singh P, Arya C. A Comprehensive Review of CNN and Multimodal AI Techniques for Skin and Lung Cancer Diagnosis. In: 2025 IEEE 7th International Conference on Computing, Communication and Automation (ICCCA) 2025 Nov 28 (pp. 1-4). IEEE

2025
[19]

A CAD system for lung cancer detection using hybrid deep learning techniques

Alsheikhy AA, Said Y, Shawly T, Alzahrani AK, Lahza H. A CAD system for lung cancer detection using hybrid deep learning techniques. Diagnostics (Basel). 2023;13(6):1174. doi:10.3390/diagnostics13061174

work page doi:10.3390/diagnostics13061174 2023
[20]

Overall staging prediction for non-small cell lung cancer (NSCLC): a local pilot study with artificial neural network approach

Cheung EYW, Kwong VHY, Ng KCF, Lui MKY, Li VTW, Lee RST, et al. Overall staging prediction for non-small cell lung cancer (NSCLC): a local pilot study with artificial neural network approach. Cancers (Basel). 2025;17(3):523. doi:10.3390/cancers17030523

work page doi:10.3390/cancers17030523 2025
[21]

Data from NSCLC-Radiomics

Hugo A, et al. Data from NSCLC-Radiomics. The Cancer Imaging Archive. 2015. Available from: https://doi.org/10.7937/K9/TCIA.2015.PF0M9REI

work page doi:10.7937/k9/tcia.2015.pf0m9rei 2015
[22]

Computational radiomics system to decode the radiographic phenotype

Van Griethuysen JJM, et al. Computational radiomics system to decode the radiographic phenotype. Cancer Res. 2017;77: e104–e107

2017
[23]

3D Slicer: a platform for subject-specific image analysis, visualization, and clinical support

Kikinis R, Pieper SD, Vosburgh KG. 3D Slicer: a platform for subject-specific image analysis, visualization, and clinical support. In: Jolesz F, editor. Intraoperative Imaging and Image-Guided Therapy. New York: Springer; 2014

2014
[24]

Benchmarking various radiomic toolkit features while applying the image biomarker standardization initiative toward clinical translation of radiomic analysis

Lei M, Varghese B, Hwang D, Cen S, Lei X, Desai B, et al. Benchmarking various radiomic toolkit features while applying the image biomarker standardization initiative toward clinical translation of radiomic analysis. J Digit Imaging. 2021;34(5):1156–1170. doi:10.1007/s10278-021-00506-6

work page doi:10.1007/s10278-021-00506-6 2021
[25]

A theoretical distribution analysis of synthetic minority oversampling technique (SMOTE) for imbalanced learning

Elreedy D, Atiya AF, Kamalov F. A theoretical distribution analysis of synthetic minority oversampling technique (SMOTE) for imbalanced learning. Mach Learn. 2024;113: 4903–

2024
[26]

doi:10.1007/s10994-022-06296-4

work page doi:10.1007/s10994-022-06296-4
[27]

A closer look at classification evaluation metrics and a critical reflection of common evaluation practice

Opitz J. A closer look at classification evaluation metrics and a critical reflection of common evaluation practice. Trans Assoc Comput Linguist. 2024; 12:820–836. doi:10.1162/tacl_a_00675

work page doi:10.1162/tacl_a_00675 2024
[28]

Integrative radiomics clustering analysis to decipher breast cancer heterogeneity and prognostic indicators through multiparametric MRI

He Y, Duan S, Wang W, et al. Integrative radiomics clustering analysis to decipher breast cancer heterogeneity and prognostic indicators through multiparametric MRI. NPJ Breast Cancer. 2024; 10:72. doi:10.1038/s41523-024-00678-8

work page doi:10.1038/s41523-024-00678-8 2024
[29]

Predicting sport event outcomes using deep learning

Gao J, Cheng Y, Gao J. Predicting sport event outcomes using deep learning. PeerJ Comput Sci. 2025;11: e3011. doi:10.7717/peerj-cs.3011

work page doi:10.7717/peerj-cs.3011 2025
[30]

Spie charts, target plots, and radar plots for displaying comparative outcomes of health care

Stafoggia M, Lallo A, Fusco D, Barone AP, D'Ovidio M, Sorge C, et al. Spie charts, target plots, and radar plots for displaying comparative outcomes of health care. J Clin Epidemiol. 2011;64(7):770–778. doi:10.1016/j.jclinepi.2010.10.009

2011
[31]

Review of feature selection approaches based on grouping of features

Kuzudisli C, Bakir-Gungor B, Bulut N, Qaqish B, Yousef M. Review of feature selection approaches based on grouping of features. PeerJ. 2023; 11:e15666. doi:10.7717/peerj.15666

work page doi:10.7717/peerj.15666 2023
[32]

Biomedical physics & engineering express

Raptis S, Ilioudis C, Theodorou K. Biomedical physics & engineering express. Biomed Phys Eng Express. 2024;10(3):035016

2024
[33]

Feature selection using autoencoders

Tomar D, Prasad Y, Thakur MK, Biswas KK. Feature selection using autoencoders. In: Proceedings of the International Conference on Machine Learning and Data Science (MLDS); 2017; Noida, India. p. 56–60. doi:10.1109/MLDS.2017.20.

work page doi:10.1109/mlds.2017.20 2017

[1] [1]

Radiomics

EXTRACTION OF RADIOMIC FEATURES USING 3D SLICER PyRadiomics EXTENSION The following steps are designed to compute radiomic features of a lung CT DICOM file using 3D Slicer PyRadiomics extension and to save in a file of comma separated value (csv) format. 1.1. Install 3D Slicer Install and open 3D Slicer (use the latest stable release from https://download...

[2] [2]

DEVELOPING A RADIOMICS-BASED CANCER DETECTION MODEL USING PYTHON LIBRARIES Following steps are su mmarized for a user to develop , train and test a cancer detection model using Python Libraries with radiomic features of CT datasets. 2.1. Format the radiomics dataset saved in csv format such that each row should represent one patient/sample, and each colum...

[3] [3]

Run the Python code in Jupyter notebook

RUNNING THE JUPYTER NOTEBOOK TO BUILD AND TEST MODEL 3.1. Run the Python code in Jupyter notebook. A prompt is received to upload the radiomics csv file as shown in Figure 4. 3.2. Upload the radiomics.csv 3.3. Save the classification results and generated graphs. REPRESENTATIVE RESULTS: DATASET SUMMERY The NSCLC Radiomics dataset comprises 422 CT volumes ...

[4] [4]

Deep learning and MRI biomarkers for precise lung Cancer cell detection and diagnosis

Kumar S, Singh J, Ravi V, Singh P, Al Mazroa A, Diwakar M, Gupta I. Deep learning and MRI biomarkers for precise lung Cancer cell detection and diagnosis. The Open Bioinformatics Journal. 2024 Sep 19;17(1).Avanzo M, Stancanello J, Pirrone G. Radiomics and deep learning in lung cancer. Strahlenther Onkol. 2020;1 96:879–887

2024

[5] [5]

Radiomics based likelihood functions for cancer diagnosis

Shakir H, Deng Y, Rasheed H, et al. Radiomics based likelihood functions for cancer diagnosis. Sci Rep. 2019; 9:9501

2019

[6] [6]

A deep learning-based cancer survival time classifier for small datasets

Shakir H, Aijaz B, Khan TM, Hussain M. A deep learning-based cancer survival time classifier for small datasets. Computers in Biology and Medicine. 2023 Jun 1;160 :106896

2023

[7] [7]

Applications of radiomics in precision diagnosis and treatment of oncology: opportunities and challenges

Liu Z, Wang S, Dong D, Wei J, Fang C, Zhou X, et al. Applications of radiomics in precision diagnosis and treatment of oncology: opportunities and challenges. Theranostics. 2019;9(5):1303–1322

2019

[8] [8]

Insights into radiomics: impact of feature selection and classification

Perniciano A, Loddo A, Di Ruberto C, et al. Insights into radiomics: impact of feature selection and classification. Multimed Tools Appl. 2025; 84:31695–31721

2025

[9] [9]

Radiomic feature selection for lung cancer classifiers

Shakir H, Rasheed H, Khan TMR. Radiomic feature selection for lung cancer classifiers. J Intell Fuzzy Syst. 2020;38(5):5847–5855

2020

[10] [10]

Analyzing the impact of feature selection methods on machine learning algorithms for heart disease prediction

Noroozi Z, Orooji A, Erfannia L. Analyzing the impact of feature selection methods on machine learning algorithms for heart disease prediction. Sci Rep. 2023; 13:22588

2023

[11] [11]

A comprehensive survey on feature selection in va rious fields of machine learning

Dhal P, Azad C. A comprehensive survey on feature selection in va rious fields of machine learning. Appl Intell. 2022; 52:4543–4581

2022

[12] [12]

Artificial intelligence: deep learning in oncological radiomics and challenges of interpretability and data harmonization

Papadimitroulas P, Brocki L, Chung NC, Marchadour W, Vermet F, Gaubert L, et al. Artificial intelligence: deep learning in oncological radiomics and challenges of interpretability and data harmonization. Phys Med. 2021; 83:108–121

2021

[13] [13]

Preselection of robust radiomic features does not improve outcome modelling in non-small cell lung cancer based on clinical routine FDG -PET imaging

Oliveira C, et al. Preselection of robust radiomic features does not improve outcome modelling in non-small cell lung cancer based on clinical routine FDG -PET imaging. EJNMMI Res. 2021; 11:79

2021

[14] [14]

Ensemble feature selection and classification methods for machine learning -based coronary artery disease diagnosis

Kolukisa B, Bakir -Gungor B . Ensemble feature selection and classification methods for machine learning -based coronary artery disease diagnosis. Comput Stand Interfaces . 2023;84: 103706. doi: 10.1016/j.csi.2022.103706

work page doi:10.1016/j.csi.2022.103706 2023

[15] [15]

Feature selection may improve deep neural networks for the bioinformatics problems

Chen Z, Pang M, Zhao Z, Li S, Miao R, Zhang Y, Feng X, Feng X, Zhang Y, Duan M, Huang L. Feature selection may improve deep neural networks for the bioinformatics problems. Bioinformatics. 2020 Mar;36(5):1542-52

2020

[16] [16]

Feature selection using deep neural networks

Roy D, Murty KSR, Mohan CK. Feature selection using deep neural networks. In: Proceedings of the International Joint Conference on Neural Networks (IJCNN); 2015; Killarney, Ireland. p. 1–6. doi:10.1109/IJCNN.2015.7280626

work page doi:10.1109/ijcnn.2015.7280626 2015

[17] [17]

Lung cancer detection using VGG16 and CNN

Kapoor V, Mittal A, Garg S, Diwakar M, Mishra AK, Singh P. Lung cancer detection using VGG16 and CNN. In2023 IEEE World Conference on Applied Intelligence and Computing (AIC) 2023 Jul 29 (pp. 758-762). IEEE

2023

[18] [18]

A Comprehensive Review of CNN and Multimodal AI Techniques for Skin and Lung Cancer Diagnosis

Mishra N, Diwakar M, Roka S, Pandey NK, Singh P, Arya C. A Comprehensive Review of CNN and Multimodal AI Techniques for Skin and Lung Cancer Diagnosis. In: 2025 IEEE 7th International Conference on Computing, Communication and Automation (ICCCA); 2025 Nov 28. p. 1–4. IEEE

2025

[19] [19]

A CAD system for lung cancer detection using hybrid deep learning techniques

Alsheikhy AA, Said Y, Shawly T, Alzahrani AK, Lahza H. A CAD system for lung cancer detection using hybrid deep learning techniques. Diagnostics (Basel). 2023;13(6):1174. doi:10.3390/diagnostics13061174

2023

[20] [20]

Overall staging prediction for non-small cell lung cancer (NSCLC): a local pilot study with artificial neural network approach

Cheung EYW, Kwong VHY, Ng KCF, Lui MKY, Li VTW, Lee RST, et al. Overall staging prediction for non-small cell lung cancer (NSCLC): a local pilot study with artificial neural network approach. Cancers (Basel). 2025;17(3):523. doi:10.3390/cancers17030523

2025

[21] [21]

Data from NSCLC-Radiomics

Hugo A, et al. Data from NSCLC-Radiomics. The Cancer Imaging Archive. 2015. Available from: https://doi.org/10.7937/K9/TCIA.2015.PF0M9REI

2015

[22] [22]

Computational radiomics system to decode the radiographic phenotype

Van Griethuysen JJM, et al. Computational radiomics system to decode the radiographic phenotype. Cancer Res. 2017;77:e104–e107

2017

[23] [23]

3D Slicer: a platform for subject-specific image analysis, visualization, and clinical support

Kikinis R, Pieper SD, Vosburgh KG. 3D Slicer: a platform for subject-specific image analysis, visualization, and clinical support. In: Jolesz F, editor. Intraoperative Imaging and Image-Guided Therapy. New York: Springer; 2014

2014

[24] [24]

Benchmarking various radiomic toolkit features while applying the image biomarker standardization initiative toward clinical translation of radiomic analysis

Lei M, Varghese B, Hwang D, Cen S, Lei X, Desai B, et al. Benchmarking various radiomic toolkit features while applying the image biomarker standardization initiative toward clinical translation of radiomic analysis. J Digit Imaging. 2021;34(5):1156–1170. doi:10.1007/s10278-021-00506-6

2021

[25] [25]

A theoretical distribution analysis of synthetic minority oversampling technique (SMOTE) for imbalanced learning

Elreedy D, Atiya AF, Kamalov F. A theoretical distribution analysis of synthetic minority oversampling technique (SMOTE) for imbalanced learning. Mach Learn. 2024;113:4903–

2024

[26] [26]

doi:10.1007/s10994-022-06296-4


[27] [27]

A closer look at classification evaluation metrics and a critical reflection of common evaluation practice

Opitz J. A closer look at classification evaluation metrics and a critical reflection of common evaluation practice. Trans Assoc Comput Linguist. 2024;12:820–836. doi:10.1162/tacl_a_00675

2024

[28] [28]

Integrative radiomics clustering analysis to decipher breast cancer heterogeneity and prognostic indicators through multiparametric MRI

He Y, Duan S, Wang W, et al. Integrative radiomics clustering analysis to decipher breast cancer heterogeneity and prognostic indicators through multiparametric MRI. NPJ Breast Cancer. 2024;10:72. doi:10.1038/s41523-024-00678-8

2024

[29] [29]

Predicting sport event outcomes using deep learning

Gao J, Cheng Y, Gao J. Predicting sport event outcomes using deep learning. PeerJ Comput Sci. 2025;11:e3011. doi:10.7717/peerj-cs.3011

2025

[30] [30]

Spie charts, target plots, and radar plots for displaying comparative outcomes of health care

Stafoggia M, Lallo A, Fusco D, Barone AP, D'Ovidio M, Sorge C, et al. Spie charts, target plots, and radar plots for displaying comparative outcomes of health care. J Clin Epidemiol. 2011;64(7):770–778. doi:10.1016/j.jclinepi.2010.10.009

2011

[31] [31]

Review of feature selection approaches based on grouping of features

Kuzudisli C, Bakir-Gungor B, Bulut N, Qaqish B, Yousef M. Review of feature selection approaches based on grouping of features. PeerJ. 2023;11:e15666. doi:10.7717/peerj.15666

2023

[32] [32]

Biomedical physics & engineering express

Raptis S, Ilioudis C, Theodorou K. Biomedical physics & engineering express. Biomed Phys Eng Express. 2024;10(3):035016

2024

[33] [33]

Feature selection using autoencoders

Tomar D, Prasad Y, Thakur MK, Biswas KK. Feature selection using autoencoders. In: Proceedings of the International Conference on Machine Learning and Data Science (MLDS); 2017; Noida, India. p. 56–60. doi:10.1109/MLDS.2017.20

2017