arxiv: 1907.04822 · v1 · pith:XM7Q4U7Inew · submitted 2019-07-10 · 🧬 q-bio.QM · cs.CV · eess.IV

Improving Prognostic Performance in Resectable Pancreatic Ductal Adenocarcinoma using Radiomics and Deep Learning Features Fusion in CT Images

Pith reviewed 2026-05-24 23:09 UTC · model grok-4.3

classification 🧬 q-bio.QM · cs.CV · eess.IV
keywords radiomics · deep learning · feature fusion · pancreatic ductal adenocarcinoma · prognosis · CT imaging · survival prediction · PDAC

The pith

A risk-score based fusion of radiomics and deep learning features from CT images improves overall survival prognosis in resectable pancreatic ductal adenocarcinoma.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops and tests a feature fusion pipeline that merges predefined radiomics features with features learned by deep neural networks from preoperative CT scans. It evaluates the fused set against single-feature banks and against standard reduction or fusion techniques on cohorts of patients with surgically resectable PDAC, using overall survival as the endpoint. The central demonstration is that the proposed fusion lifts predictive performance measured by area under the ROC curve. A sympathetic reader would care because more accurate pre-operative risk stratification could inform decisions about resection versus other therapies in a disease where outcomes remain poor. The work directly addresses whether hand-crafted radiomics retain additive value once deep representations are available.

Core claim

The risk-score based feature fusion method applied to a combined bank of deep learning and predefined radiomics features extracted from CT images of resectable PDAC patients significantly improves prognosis performance for overall survival, elevating the area under the ROC curve by 51 percent compared with predefined radiomics features alone, by 16 percent compared with deep learning features alone, and by 32 percent compared with existing feature fusion and reduction methods.
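The percentages above are easiest to read as relative gains in AUC. The abstract does not say so explicitly, but an absolute gain of 51 points over any baseline at or above 0.5 would push the AUC past 1. A minimal sketch of that arithmetic follows; the function name and the AUC values are hypothetical and are not taken from the paper.

```python
def relative_auc_gain(baseline_auc: float, new_auc: float) -> float:
    """Relative improvement of new_auc over baseline_auc, as a fraction."""
    if not 0.0 < baseline_auc <= 1.0:
        raise ValueError("baseline_auc must lie in (0, 1]")
    if not 0.0 <= new_auc <= 1.0:
        raise ValueError("new_auc must lie in [0, 1]")
    return (new_auc - baseline_auc) / baseline_auc


# Hypothetical AUCs chosen only to illustrate the arithmetic; not from the paper.
baseline = 0.50
fused = 0.755
gain = relative_auc_gain(baseline, fused)
print(f"absolute gain: {fused - baseline:.3f}, relative gain: {gain:.0%}")
```

Under this reading, a 51% lift from a near-chance baseline is a much smaller absolute change than the headline number suggests, which is why the baseline AUCs matter when weighing the claim.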

What carries the argument

The risk-score based feature fusion method, which derives a composite score from complementary information in radiomics and deep learning feature banks to avoid redundancy.

If this is right

  • The fused feature bank supplies information for survival prediction that is not present in either radiomics or deep learning features alone.
  • The risk-score fusion outperforms common dimensionality-reduction and concatenation approaches on the same combined feature set.
  • Improved AUC translates directly to better separation of high-risk and low-risk resectable PDAC patients before surgery.
  • The method can be applied to the same CT acquisition protocol already used in routine staging.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The result implies that traditional radiomics descriptors and convolutional representations encode partly distinct aspects of tumor biology visible on CT.
  • The same fusion strategy could be tested on other solid tumors or on longitudinal imaging to monitor treatment response.
  • If the performance lift holds, the composite score could be inserted into existing nomograms or electronic health record alerts without requiring new hardware.

Load-bearing premise

The assumption that the risk-score fusion reliably captures non-redundant predictive information from the two feature families and that performance gains are not artifacts of how competing methods were selected or tuned.

What would settle it

An independent validation cohort of resectable PDAC patients in which the proposed fusion produces no statistically significant AUC gain over the stronger of the two single-feature sets or over the best alternative fusion method.
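One way to run such a check is a paired percentile bootstrap on the AUC difference between two scoring methods evaluated on the same patients. The sketch below is a simple stand-in, not the paper's procedure (its reference list points to DeLong's test and pROC); the function names, the 0/1 outcome coding, and the toy data are all assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Rank-based AUC: chance that a positive case outscores a negative one."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_difference(
    scores_a: Sequence[float],
    scores_b: Sequence[float],
    labels: Sequence[int],
    n_boot: int = 2000,
    seed: int = 0,
) -> Tuple[float, float, float]:
    """Return (observed AUC_a - AUC_b, 2.5th percentile, 97.5th percentile)."""
    rng = random.Random(seed)
    n = len(labels)
    observed = auc(scores_a, labels) - auc(scores_b, labels)
    diffs: List[float] = []
    while len(diffs) < n_boot:
        # Resample patients with replacement, keeping both methods paired.
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if 0 < sum(y) < n:  # skip resamples that contain only one class
            a = [scores_a[i] for i in idx]
            b = [scores_b[i] for i in idx]
            diffs.append(auc(a, y) - auc(b, y))
    diffs.sort()
    return observed, diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot) - 1]


# Hypothetical scores for eight patients; 1 = event, 0 = no event.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
fused = [0.9, 0.8, 0.7, 0.4, 0.5, 0.3, 0.2, 0.1]
single = [0.6, 0.4, 0.7, 0.2, 0.5, 0.3, 0.8, 0.1]
diff, low, high = bootstrap_auc_difference(fused, single, labels)
print(f"AUC difference {diff:.3f}, 95% interval [{low:.3f}, {high:.3f}]")
```

If the interval on an independent cohort comfortably includes zero, the fusion would not be showing a reliable gain over the comparator on that cohort.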

Original abstract

As an analytic pipeline for quantitative imaging feature extraction and analysis, radiomics has grown rapidly in the past a few years. Recent studies in radiomics aim to investigate the relationship between tumors imaging features and clinical outcomes. Open source radiomics feature banks enable the extraction and analysis of thousands of predefined features. On the other hand, recent advances in deep learning have shown significant potential in the quantitative medical imaging field, raising the research question of whether predefined radiomics features have predictive information in addition to deep learning features. In this study, we propose a feature fusion method and investigate whether a combined feature bank of deep learning and predefined radiomics features can improve the prognostics performance. CT images from resectable Pancreatic Adenocarcinoma (PDAC) patients were used to compare the prognosis performance of common feature reduction and fusion methods and the proposed risk-score based feature fusion method for overall survival. It was shown that the proposed feature fusion method significantly improves the prognosis performance for overall survival in resectable PDAC cohorts, elevating the area under ROC curve by 51% compared to predefined radiomics features alone, by 16% compared to deep learning features alone, and by 32% compared to existing feature fusion and reduction methods for a combination of deep learning and predefined radiomics features.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes a risk-score based feature fusion method to combine deep learning features and predefined radiomics features extracted from CT images of resectable pancreatic ductal adenocarcinoma (PDAC) patients. It claims this fusion significantly improves prognostic performance for overall survival, with AUC gains of 51% over radiomics features alone, 16% over deep learning features alone, and 32% over existing feature fusion/reduction methods.

Significance. If the reported AUC improvements are shown to be robust under proper validation, the work would contribute to understanding whether handcrafted radiomics and deep learning features provide complementary prognostic information in PDAC, potentially informing multimodal imaging biomarker development. The paper does not ship machine-checked proofs or parameter-free derivations.

major comments (2)
  1. [Abstract] Abstract: the central claim of AUC improvements (51%, 16%, 32%) is presented without any report of cohort size, validation strategy (e.g., cross-validation or held-out test), statistical tests, error bars, or exclusion criteria. These details are load-bearing for assessing whether the risk-score fusion captures non-redundant information or reflects overfitting on a small PDAC cohort.
  2. [Abstract] Abstract and methods description: no information is given on how the risk-score is computed or whether the compared fusion/reduction methods were pre-specified versus selected post-hoc after inspecting performance on the evaluation data. This directly affects the validity of the 32% superiority claim over existing methods.
minor comments (1)
  1. [Abstract] Abstract: minor phrasing issue: 'past a few years' should read 'past few years'.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major comment below and agree that the abstract requires additional details for clarity and to allow independent assessment of the claims.

  1. Referee: [Abstract] Abstract: the central claim of AUC improvements (51%, 16%, 32%) is presented without any report of cohort size, validation strategy (e.g., cross-validation or held-out test), statistical tests, error bars, or exclusion criteria. These details are load-bearing for assessing whether the risk-score fusion captures non-redundant information or reflects overfitting on a small PDAC cohort.

    Authors: We agree these elements are essential. The full manuscript reports a single-institution cohort of resectable PDAC patients, employs cross-validation, applies statistical testing for comparisons, and specifies exclusion criteria. We will revise the abstract to explicitly state the cohort size, validation approach, statistical testing, and exclusion criteria, along with noting that performance metrics include appropriate variability measures where applicable. revision: yes

  2. Referee: [Abstract] Abstract and methods description: no information is given on how the risk-score is computed or whether the compared fusion/reduction methods were pre-specified versus selected post-hoc after inspecting performance on the evaluation data. This directly affects the validity of the 32% superiority claim over existing methods.

    Authors: The Methods section details the risk-score computation as a linear combination of the fused radiomics and deep learning features with coefficients derived from a Cox proportional hazards model. The comparator fusion and reduction methods were pre-specified from the existing literature prior to any performance evaluation. We will add a brief description of the risk-score computation to the abstract and explicitly state that the comparators were pre-specified to strengthen the 32% claim. revision: yes
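The rebuttal's description of the risk score as a linear combination of features weighted by Cox coefficients can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the coefficients, feature values, and binary outcome coding are hypothetical, and fitting the Cox model itself is left out.

```python
from typing import Sequence


def risk_score(features: Sequence[float], coefficients: Sequence[float]) -> float:
    """Linear predictor of a Cox model: sum of coefficient * feature value."""
    if len(features) != len(coefficients):
        raise ValueError("features and coefficients must have equal length")
    return sum(b * x for b, x in zip(coefficients, features))


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Rank-based AUC: chance that a positive case outscores a negative one."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical coefficients, as if taken from an already fitted Cox model.
beta = [0.8, -0.5, 1.2]

# Hypothetical fused feature vectors, one row per patient.
patients = [
    [1.0, 0.2, 0.9],
    [0.1, 1.0, 0.2],
    [0.7, 0.3, 0.6],
    [0.2, 0.9, 0.1],
]
# Hypothetical binary outcome: 1 = died before the cutoff, 0 = survived.
died_early = [1, 0, 1, 0]

scores = [risk_score(p, beta) for p in patients]
print("risk scores:", [round(s, 2) for s in scores])
print("AUC:", auc(scores, died_early))
```

In a real evaluation the coefficients would have to be fitted on training folds only and the AUC computed on held-out patients, which is the point the referee's first comment presses on.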

Circularity Check

0 steps flagged

No circularity: empirical comparison of feature fusion methods with no derivations or self-referential reductions

The paper is an applied empirical study comparing radiomics, deep learning, and a proposed risk-score fusion method on CT images for PDAC prognosis. The abstract and described content contain no equations, no parameter-fitting steps presented as predictions, and no load-bearing self-citations or uniqueness theorems. Performance claims (AUC improvements) are statistical comparisons on patient cohorts rather than algebraic identities or fitted inputs renamed as outputs. The central claim rests on experimental results, not on any derivation chain that reduces to its own inputs by construction. This is the expected non-finding for a methods-comparison paper without theoretical derivations.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no free parameters, axioms, or invented entities are described or can be inferred.

pith-pipeline@v0.9.0 · 5797 in / 1132 out tokens · 30846 ms · 2026-05-24T23:09:38.554616+00:00 · methodology

Reference graph

Works this paper leans on

56 extracted references · 56 canonical work pages

  1. [1]

    higher-order

    Several cross-cohort and multi-centre studies have shown that several PyRadiomics features are robust to different scanners and clinician annotations [8,15,19,20]. Despite recent progress, the traditional radiomics analytics pipeline has a few drawbacks. First, the formulas of features are predefined, and thus, can be very similar to one another. This leads to high...

  2. [2]

    Yip, S. S. F. & Aerts, H. J. W. L. Applications and limitations of radiomics. Phys. Med. Biol. 61, R150-66 (2016)

  3. [3]

    Parmar, C., Grossmann, P., Bussink, J., Lambin, P. & Aerts, H. J. W. L. Machine Learning methods for Quantitative Radiomic Biomarkers. Sci. Rep. 5, 13087 (2015)

  4. [4]

    Kumar, V. et al. Radiomics: the process and the challenges. Magn. Reson. Imaging 30, 1234–1248 (2012)

  5. [5]

    Aerts, H. J. W. L. et al. Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach. Nat. Commun. 5, 4006 (2014)

  6. [6]

    Aerts, H. J. W. L. The Potential of Radiomic-Based Phenotyping in Precision Medicine. JAMA Oncol. 2, 1636 (2016)

  7. [7]

    Hawkins, S. et al. Predicting Malignant Nodules from Screening CT Scans. J. Thorac. Oncol. 11, 2120–2128 (2016)

  8. [8]

    Eilaghi, A. et al. CT texture features are associated with overall survival in pancreatic ductal adenocarcinoma – a quantitative analysis. BMC Med. Imaging 17, 38 (2017)

  9. [9]

    Khalvati, F. et al. Prognostic Value of CT Radiomic Features in Resectable Pancreatic Ductal Adenocarcinoma. Nat. Sci. Reports (2019). doi:10.1038/s41598-019-41728-7

  10. [10]

    Zhang, Y., Oikonomou, A., Wong, A., Haider, M. A. & Khalvati, F. Radiomics-based Prognosis Analysis for Non-Small Cell Lung Cancer. Nat. Sci. Reports 7, (2017)

  11. [11]

    Lambin, P. et al. Radiomics: Extracting more information from medical images using advanced feature analysis. Eur. J. Cancer 48, 441–446 (2012)

  12. [12]

    Oikonomou, A., Khalvati, F. & et al. Radiomics analysis at PET/CT contributes to prognosis of recurrence and survival in lung cancer treated with stereotactic body radiotherapy. Sci. Rep. 8, (2018)

  13. [13]

    Haider, M. A. et al. CT texture analysis: a potential tool for prediction of survival in patients with metastatic clear cell carcinoma treated with sunitinib. Cancer Imaging 17, (2017)

  14. [14]

    Khalvati, F., Zhang, Y., Wong, A. & Haider, M. A. Radiomics. in Encyclopedia of Biomedical Engineering 2, 597–603 (2019)

  15. [15]

    Lambin, P. et al. Radiomics: the bridge between medical imaging and personalized medicine. Nat. Rev. Clin. Oncol. 14, 749–762 (2017)

  16. [16]

    Van Griethuysen, J. J. M. et al. Computational radiomics system to decode the radiographic phenotype. Cancer Res. 77, e104–e107 (2017)

  17. [17]

    Aerts, H. J. et al. Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach. Nat Commun 5, 4006 (2014)

  18. [18]

    Li, Y. et al. MRI features predict p53 status in lower-grade gliomas via a machine- learning approach. NeuroImage Clin. 17, 306–311 (2018)

  19. [19]

    Li, H. et al. MR Imaging Radiomics Signatures for Predicting the Risk of Breast Cancer Recurrence as Given by Research Versions of MammaPrint, Oncotype DX, and PAM50 Gene Assays. Radiology 281, 382–391 (2016)

  20. [20]

    Parmar, C. et al. Robust Radiomics Feature Quantification Using Semiautomatic Volumetric Segmentation. 9, 1–8 (2014)

  21. [21]

    Traverso, A., Wee, L., Dekker, A. & Gillies, R. Repeatability and Reproducibility of Radiomic Features: A Systematic Review. Int. J. Radiat. Oncol. 102, 1143–1158 (2018)

  22. [22]

    Sanduleanu, S. et al. Tracking tumor biology with radiomics: A systematic review utilizing a radiomics quality score. Radiotherapy and Oncology 127, 349–360 (2018)

  23. [23]

    Chen, S.-Y., Feng, Z. & Yi, X. A general introduction to adjustment for multiple comparisons. J. Thorac. Dis. 9, 1725–1729 (2017)

  24. [24]

    Krizhevsky, A., Sutskever, I. & Hinton, G. E. Classification with Deep Convolutional Neural Networks. in

  25. [25]

    Yamashita, R., Nishio, M., Do, R. K. G. & Togashi, K. Convolutional neural networks: an overview and application in radiology. Insights Imaging 9, 611–629 (2018)

  26. [26]

    Irvin, J. et al. CheXpert: A Large Chest Radiograph Dataset with Uncertainty Labels and Expert Comparison

  27. [27]

    Pan, S. J. & Yang, Q. A survey on transfer learning. IEEE Trans. Knowl. Data Eng. 22, 1345–1359 (2010)

  28. [28]

    Tan, C. et al. A Survey on Deep Transfer Learning. (2018)

  29. [29]

    He, K., Girshick, R. & Dollár, P. Rethinking ImageNet Pre-training. (2018)

  30. [30]

    HUBEL, D. H. & WIESEL, T. N. Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. J. Physiol. 160, 106–54 (1962)

  31. [31]

    George, D., Shen, H. & Huerta, E. A. Deep Transfer Learning: A new deep learning glitch classification method for advanced LIGO. (2017)

  32. [32]

    Torrey, L. & Shavlik, J. Transfer Learning

  33. [33]

    Lao, J. et al. A Deep Learning-Based Radiomics Model for Prediction of Survival in Glioblastoma Multiforme. Sci. Rep. 7, 10353 (2017)

  34. [34]

    Zhang, Y. et al. CNN-based Survival Model for Pancreatic Ductal Adenocarcinoma in Medical Imaging. arXiv (2019)

  35. [35]

    Zhang, Y. et al. Improving Prognostic Value of CT Deep Radiomic Features in Pancreatic Ductal Adenocarcinoma Using Transfer Learning. arXiv (2019)

  36. [36]

    Kursa, M. B. & Rudnicki, W. R. Feature Selection with the Boruta Package. J. Stat. Softw. 36, 1–13 (2010)

  37. [37]

    Abdi, H. & Williams, L. J. Principal component analysis. Wiley Interdiscip. Rev. Comput. Stat. 2, 433–459 (2010)

  38. [38]

    Cox, D. R. Regression Models and Life-Tables. Journal of the Royal Statistical Society. Series B (Methodological) 34, 187–220 (1972)

  39. [39]

    Hira, Z. M. & Gillies, D. F. A Review of Feature Selection and Feature Extraction Methods Applied on Microarray Data. 2015, (2015)

  40. [40]

    Larran, P. & Saeys, Y. A review of feature selection techniques in bioinformatics. 23, 2507–2517 (2007)

  41. [41]

    Zhang, J., Baig, S., Wong, A., Haider, M. A. & Khalvati, F. A Local ROI-specific Atlas- based Segmentation of Prostate Gland and Transitional Zone in Diffusion MRI. J. Comput. Vis. Imaging Syst. 2, (2016)

  42. [42]

    He, K., Zhang, X., Ren, S. & Sun, J. Deep Residual Learning for Image Recognition. in IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016). doi:10.1109/CVPR.2016.90

  43. [43]

    Chollet, F. & Others, A. Keras. (2015)

  44. [44]

    He, K., Zhang, X., Ren, S. & Sun, J. Deep Residual Learning for Image Recognition. in IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016). doi:10.3389/fpsyg.2013.00124

  45. [46]

    Armato, S. G. et al. The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI): a completed reference database of lung nodules on CT scans. Med. Phys. 38, 915–31 (2011)

  46. [47]

    Deng, J. et al. ImageNet: A large-scale hierarchical image database. in 2009 IEEE Conference on Computer Vision and Pattern Recognition 248–255 (IEEE, 2009). doi:10.1109/CVPR.2009.5206848

  47. [48]

    Breiman, L. Random Forests. 1–33 (2001)

  48. [49]

    Fawcett, T. An introduction to ROC analysis. (2005). doi:10.1016/j.patrec.2005.10.010

  49. [50]

    DeLong, E. R., DeLong, D. M. & Clarke-Pearson, D. L. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 44, 837–45 (1988)

  50. [51]

    Robin, X. et al. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics 12, 77 (2011)

  51. [52]

    Mukaka, M. M. Statistics corner: A guide to appropriate use of correlation coefficient in medical research. Malawi Med. J. 24, 69–71 (2012)

  52. [53]

    Breiman, L. Stacked regressions. Mach. Learn. 24, 49–64 (1996)

  53. [54]

    Dietterich, T. G. Ensemble Methods in Machine Learning. in 1–15 (Springer, Berlin, Heidelberg, 2000). doi:10.1007/3-540-45014-9_1

  54. [55]

    Rokach, L. Ensemble Methods for Classifiers. in Data Mining and Knowledge Discovery Handbook 957–980 (Springer-Verlag, 2005). doi:10.1007/0-387-25465-X_45

  55. [56]

    Suk, H.-I. & Shen, D. Deep Ensemble Sparse Regression Network for Alzheimer’s Disease Diagnosis. in 113–121 (2016). doi:10.1007/978-3-319-47157-0_14

  56. [57]

    Yang, P., Yang, Y. H., Zhou, B. B. & Zomaya, A. Y. A review of ensemble methods in bioinformatics: * Including stability of feature selection and ensemble feature selection methods (updated on 28 Sep. 2016)