Virtual Scanning for NSCLC Histology: Investigating the Discriminatory Power of Synthetic PET
Pith reviewed 2026-05-08 18:26 UTC · model grok-4.3
The pith
Synthetic PET volumes generated from CT scans raise AUC for NSCLC subtype classification from 0.489 to 0.591.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Synthetic PET data produced by a pretrained 3D Pix2Pix GAN supplies complementary metabolic features that, when fused with CT via the MINT architecture, raise AUC from 0.489 to 0.591 and GMean from 0.305 to 0.524 for histological subtype classification on 714 subjects.
What carries the argument
The 3D Pix2Pix GAN that maps anatomical CT to pseudo-PET volumes, integrated through the MINT multi-stage intermediate fusion architecture.
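The fusion step can be caricatured in a few lines. Below is a toy numpy sketch of multi-stage intermediate fusion, with random-projection stages standing in for 3D convolutional encoders; it is illustrative only, not the actual MINT implementation, whose stage structure the pith does not reproduce.

```python
import numpy as np

rng = np.random.default_rng(0)

def stage(x, out_dim):
    # Toy stand-in for one encoder stage: random projection + tanh.
    w = rng.normal(scale=1.0 / np.sqrt(x.size), size=(x.size, out_dim))
    return np.tanh(x @ w)

def intermediate_fusion(ct_vol, pet_vol, dims=(32, 16, 8)):
    """Sketch of multi-stage intermediate fusion: encode each modality
    at every stage, concatenate the stage features, and feed the fused
    vector to both branches of the next stage."""
    ct, pet = ct_vol.reshape(-1), pet_vol.reshape(-1)
    for d in dims:
        ct, pet = stage(ct, d), stage(pet, d)
        fused = np.concatenate([ct, pet])  # fusion at every stage, not only at the end
        ct = pet = fused                   # both branches see the fused representation
    return fused

ct = rng.normal(size=(4, 4, 4))
pseudo_pet = rng.normal(size=(4, 4, 4))  # in the paper this comes from the Pix2Pix GAN
features = intermediate_fusion(ct, pseudo_pet)
print(features.shape)  # → (16,)
```

The design point the sketch captures is that fusion happens repeatedly at intermediate depths, so cross-modal interactions can shape low-level features, unlike late fusion of two finished embeddings.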
If this is right
- Synthetic metabolic volumes can function as a low-cost feature enhancer in settings where physical PET scans are not performed.
- Multimodal models can exploit cross-modal metabolic information even when one modality is synthesized rather than measured.
- The approach offers a route to reduce radiation exposure and imaging costs while preserving subtype discrimination accuracy.
Where Pith is reading between the lines
- The same virtual-scanning step could be applied to other PET-dependent tasks such as tumor staging or treatment response monitoring once the metabolic fidelity is verified.
- Direct comparison of synthetic versus real PET on a held-out paired cohort would quantify how much metabolic signal is preserved versus hallucinated.
- If the improvement holds, protocols could default to CT-only acquisition and synthesize PET on demand specifically for histology classification.
Load-bearing premise
The synthetic PET volumes reflect genuine metabolic differences between histological subtypes instead of GAN artifacts or patterns learned only from the pretraining set.
What would settle it
Performance on real PET/CT test cases falls back to the CT-only baseline when the model is forced to use only the synthetic volumes or when the synthetic volumes are replaced by noise matched in intensity distribution.
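The second control named here, noise matched in intensity distribution, is cheap to construct: permuting the voxels of the synthetic volume yields "noise" with exactly its intensity histogram but no spatial or metabolic structure. A minimal numpy sketch (an illustration, not the authors' protocol):

```python
import numpy as np

def intensity_matched_noise(volume, seed=0):
    """Return a volume with exactly the same intensity distribution but
    randomized spatial structure: assign the k-th smallest real intensity
    to the voxel holding the k-th smallest random draw."""
    rng = np.random.default_rng(seed)
    flat = volume.reshape(-1)
    noise = rng.normal(size=flat.size)
    out = np.empty_like(flat)
    out[np.argsort(noise)] = np.sort(flat)  # rank-match intensities onto noise order
    return out.reshape(volume.shape)

# Toy synthetic-PET stand-in with a skewed, PET-like intensity profile.
pet = np.random.default_rng(1).gamma(2.0, size=(8, 8, 8))
control = intensity_matched_noise(pet)
```

If the multimodal model's AUC with `control` in place of the synthetic PET stays at 0.591, the gain is not metabolic; if it falls to the CT-only baseline, the spatial content of the synthesized volumes is doing real work.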
Original abstract
Accurate histological differentiation between adenocarcinoma (ADC) and squamous cell carcinoma (SCC) is critical for personalized treatment in non-small cell lung cancer (NSCLC). While [$^{18}$F]FDG PET/CT is a standard tool for the clinical evaluation of lung cancer, its utility is often limited by high costs and radiation exposure. In this paper, we investigate the feasibility of "virtual scanning" as a feature-enhancement strategy by evaluating whether synthetic PET data can provide complementary feature representations to supplement anatomical CT scans in histological subtype classification. We propose a framework that leverages a 3D Pix2Pix Generative Adversarial Network (GAN), pretrained on the FDG-PET/CT Lesions dataset, to synthesize pseudo-PET volumes from anatomical CT scans. These synthetic volumes are integrated with structural CT data within the MINT framework, a multi-stage intermediate fusion architecture. Our experiments, conducted on a multi-center dataset of 714 subjects, demonstrate that the inclusion of synthetic metabolic features significantly improves classification performance over a CT-only baseline. The multimodal approach achieved a statistically significant increase in the Area Under the Curve (AUC) from 0.489 to 0.591 and improved the Geometric Mean (GMean) from 0.305 to 0.524. These results suggest that synthetic PET scans provide discriminatory metabolic cues that enable deep learning models to exploit complementary cross-modal information, offering a potential feature-enhancement strategy for clinical scenarios where physical PET scans are unavailable.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes synthesizing pseudo-PET volumes from CT scans via a 3D Pix2Pix GAN pretrained on the FDG-PET/CT Lesions dataset, then fusing them with structural CT using the MINT multi-stage intermediate fusion architecture to improve binary classification of NSCLC histological subtypes (adenocarcinoma vs. squamous cell carcinoma). Experiments on a 714-subject multi-center cohort report that the multimodal model yields a statistically significant AUC increase from 0.489 (CT-only) to 0.591 and GMean increase from 0.305 to 0.524, suggesting that synthetic metabolic features supply complementary discriminatory information.
Significance. If the performance gains are shown to derive from genuine metabolic subtype cues rather than fusion artifacts or GAN-induced correlations, the work could provide a practical feature-enhancement strategy for histology classification when physical PET is unavailable. The multi-center scale of the cohort is a strength that supports potential clinical translation, though the absence of isolating controls limits current interpretability of the claimed metabolic benefit.
major comments (3)
- [Abstract] The assertion of a 'statistically significant' AUC rise (0.489 to 0.591) and GMean rise (0.305 to 0.524) is presented without any description of the train/test partitioning, cross-validation scheme, exact hypothesis test, or class-imbalance handling. These omissions directly undermine evaluation of whether the multimodal lift is robust or reproducible.
- [Abstract, Methods] No ablation or control experiments are reported to test whether the observed gains originate from metabolic content in the synthetic PET volumes or from the MINT fusion architecture itself. Controls such as noise-augmented CT inputs, permuted synthetic volumes, or direct comparison against a duplicated-CT channel would be required to support the claim that synthetic PET supplies genuine complementary metabolic signal.
- [Abstract] The 3D Pix2Pix GAN is pretrained unsupervised on a separate FDG-PET/CT dataset without histology labels. Consequently, any apparent metabolic appearance in the outputs could reflect learned anatomical correlates of subtype that happen to co-vary in the pretraining distribution rather than true FDG uptake differences relevant to ADC/SCC. Additional analyses (e.g., correlation of synthetic SUV with known metabolic markers or external-cohort testing) are needed to substantiate the interpretation.
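The controls proposed in these comments are all input-level substitutions, so they can be prototyped without touching the classifier. A hedged sketch of the three input constructors (function names are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def duplicated_ct(ct):
    """Control: the 'PET' channel is a copy of CT, so any gain over the
    CT-only baseline can only come from the fusion architecture itself."""
    return ct, ct.copy()

def permuted_pet(ct, pet):
    """Control: voxelwise shuffle of the synthetic PET keeps its intensity
    distribution but destroys all spatial/metabolic structure."""
    return ct, rng.permutation(pet.reshape(-1)).reshape(pet.shape)

def noise_augmented_ct(ct, sigma=0.05):
    """Control: the second channel is CT plus small Gaussian jitter,
    testing whether any extra channel helps regardless of content."""
    return ct, ct + rng.normal(scale=sigma, size=ct.shape)

ct = rng.normal(size=(8, 8, 8))
pet = rng.gamma(2.0, size=(8, 8, 8))  # stand-in for a synthetic PET volume
pairs = [duplicated_ct(ct), permuted_pet(ct, pet), noise_augmented_ct(ct)]
```

Running the same fusion model on each pair and comparing AUCs against the reported 0.591 would isolate how much of the lift survives without genuine synthetic-PET content.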
minor comments (1)
- [Abstract] The abstract would benefit from explicitly reporting the class distribution (number of ADC vs. SCC cases) to allow readers to contextualize the GMean values and the degree of imbalance addressed.
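For context on why class distribution matters to the GMean values: GMean is the geometric mean of sensitivity and specificity, which collapses to zero whenever a model ignores one class entirely, so a GMean of 0.305 says more about imbalance than an accuracy figure would. A minimal self-contained computation on toy labels:

```python
import math

def gmean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity (thresholded
    predictions assumed); zero if either class is never predicted right."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(y_true)
    neg = len(y_true) - pos
    sens = tp / pos if pos else 0.0
    spec = tn / neg if neg else 0.0
    return math.sqrt(sens * spec)

# Imbalanced toy cohort: 6 cases of one subtype (1) vs 2 of the other (0).
y_true = [1, 1, 1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 1]
print(round(gmean(y_true, y_pred), 3))  # → 0.645
```

A classifier that always predicts the majority class scores 0.75 accuracy on this toy cohort but a GMean of exactly 0.0, which is why reporting the ADC/SCC counts is needed to interpret the paper's 0.305 baseline.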
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We agree that greater methodological transparency, control experiments, and careful discussion of limitations are needed to strengthen the claims regarding the value of synthetic PET. We will revise the manuscript accordingly, expanding the abstract, methods, results, and discussion sections while maintaining the core contribution on multi-center evaluation.
Point-by-point responses
- Referee: [Abstract] The assertion of a 'statistically significant' AUC rise (0.489 to 0.591) and GMean rise (0.305 to 0.524) is presented without any description of the train/test partitioning, cross-validation scheme, exact hypothesis test, or class-imbalance handling. These omissions directly undermine evaluation of whether the multimodal lift is robust or reproducible.
  Authors: We agree that these details are critical for reproducibility and statistical evaluation. In the revised manuscript we will expand both the abstract and methods to explicitly describe the stratified 70/30 train/test partitioning, the 5-fold cross-validation scheme, the use of DeLong's test for AUC significance, and the class-weighted loss employed to address imbalance. These additions will directly address the concern and allow readers to assess robustness. Revision: yes.
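The rebuttal names DeLong's test; a paired bootstrap over test cases is a simpler stand-in that also respects the pairing of the two models' scores on the same subjects. A pure-Python sketch (illustrative, not the authors' procedure):

```python
import random

def auc(y, s):
    """Empirical AUC: probability a random positive outscores a random
    negative, ties counted as half."""
    pos = [si for yi, si in zip(y, s) if yi == 1]
    neg = [si for yi, si in zip(y, s) if yi == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def paired_bootstrap_p(y, s_a, s_b, n_boot=1000, seed=0):
    """Two-sided bootstrap p-value for a difference in AUC between two
    models scored on the SAME cases; resampling whole cases preserves
    the pairing that DeLong's test also exploits."""
    rng = random.Random(seed)
    n = len(y)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y[i] for i in idx]
        if len(set(ys)) < 2:  # a resample needs both classes to define AUC
            continue
        diffs.append(auc(ys, [s_b[i] for i in idx]) -
                     auc(ys, [s_a[i] for i in idx]))
    frac = sum(d <= 0 for d in diffs) / n_boot
    return 2 * min(frac, 1 - frac)

# Toy cohort: model B separates the classes, model A scores at random.
y = [1, 0] * 15
r1, r2 = random.Random(1), random.Random(2)
s_a = [r1.random() for _ in y]
s_b = [yi + 0.3 * (r2.random() - 0.5) for yi in y]
p = paired_bootstrap_p(y, s_a, s_b)
```

On a real cohort, reporting both the DeLong p-value and a bootstrap confidence interval for the AUC difference would make the "statistically significant" claim checkable.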
- Referee: [Abstract, Methods] No ablation or control experiments are reported to test whether the observed gains originate from metabolic content in the synthetic PET volumes or from the MINT fusion architecture itself. Controls such as noise-augmented CT inputs, permuted synthetic volumes, or direct comparison against a duplicated-CT channel would be required to support the claim that synthetic PET supplies genuine complementary metabolic signal.
  Authors: We acknowledge the absence of isolating controls in the current version. We will add ablation experiments to the revised results section, including a noise-augmented CT baseline and a duplicated-CT-channel control to test whether gains arise from the fusion architecture alone. These controls will be reported alongside the main results to better substantiate the contribution of the synthetic PET. Revision: yes.
- Referee: [Abstract] The 3D Pix2Pix GAN is pretrained unsupervised on a separate FDG-PET/CT dataset without histology labels. Consequently, any apparent metabolic appearance in the outputs could reflect learned anatomical correlates of subtype that happen to co-vary in the pretraining distribution rather than true FDG uptake differences relevant to ADC/SCC. Additional analyses (e.g., correlation of synthetic SUV with known metabolic markers or external-cohort testing) are needed to substantiate the interpretation.
  Authors: This is a substantive limitation of the current study. The pretraining was performed unsupervised on an independent dataset without subtype labels, so the synthetic volumes may indeed encode anatomical patterns that co-vary with histology rather than pure metabolic signal. In the revision we will add an explicit limitations paragraph in the discussion that states this caveat, clarifies the scope of our claims, and outlines future work (supervised pretraining or post-hoc SUV correlation studies). We cannot perform the suggested new analyses within the current revision timeline but will strengthen the interpretive framing. Revision: partial.
Circularity Check
No circularity: empirical performance on held-out test set
Full rationale
The paper reports an empirical machine-learning study: a 3D Pix2Pix GAN is pretrained on an external FDG-PET/CT Lesions dataset to synthesize PET volumes from CT, these volumes are fused with CT inside the MINT architecture, and classification performance (AUC, GMean) is measured on a held-out multi-center test set of 714 subjects. The central claim is the observed lift from 0.489 to 0.591 AUC when synthetic PET is added. This lift is a direct experimental measurement, not a quantity obtained by solving the paper's own equations or by renaming a fitted parameter. No derivation chain, uniqueness theorem, or self-citation load-bearing step is present; the result is falsifiable by re-running the pipeline on the same splits. The absence of additional ablation controls is a validity concern, not a circularity issue.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: The FDG-PET/CT Lesions dataset used for GAN pretraining is representative of the metabolic patterns in the 714-subject NSCLC test cohort.
Reference graph
Works this paper leans on
- [1] Fatih Aksu, Fabrizia Gelardi, Arturo Chiti, and Paolo Soda. 2024. Toward a multimodal deep learning approach for histological subtype classification in NSCLC. In 2024 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). IEEE, 6327–6333.
- [2] Fatih Aksu, Fabrizia Gelardi, Arturo Chiti, and Paolo Soda. 2025. Multi-stage intermediate fusion for multimodal learning to classify non-small cell lung cancer subtypes from CT and PET. Pattern Recognition Letters (2025).
- [3] Fatih Aksu, Fabrizia Gelardi, Arturo Chiti, and Paolo Soda. 2025. NSCLC histological subtype classification from CT scans using generalist 3D medical foundation models. In 2025 IEEE 13th International Conference on Healthcare Informatics (ICHI). IEEE, 648–653.
- [4] Gerald Antoch, Jorg Stattaus, Andre T Nemat, Simone Marnitz, Thomas Beyer, Hilmar Kuehl, Andreas Bockisch, Jörg F Debatin, and Lutz S Freudenberg. 2003. Non-small cell lung cancer: dual-modality PET/CT in preoperative staging. Radiology 229, 2 (2003), 526–533.
- [5] Shaimaa Bakr, Olivier Gevaert, Sebastian Echegaray, Kelsey Ayers, Mu Zhou, Majid Shafiq, Hong Zheng, Weiruo Zhang, Ann Leung, Michael Kadoch, Joseph Shrager, Andrew Quon, Daniel Rubin, Sylvia Plevritis, and Sandy Napel. 2017. Data for NSCLC Radiogenomics Collection (Version 4) [Data set]. https://www.cancerimagingarchive.net/collection/nsclc-radiogenomics/
- [6] Joshua D Campbell, Anton Alexandrov, Jaegil Kim, Jeremiah Wala, Alice H Berger, Chandra Sekhar Pedamallu, Sachet A Shukla, Guangwu Guo, Angela N Brooks, and Bradley A Murray. 2016. Distinct patterns of somatic genome alterations in lung adenocarcinomas and squamous cell carcinomas. Nature Genetics 48, 6 (2016), 607–616.
- [7] Kari Chansky, Jean-Paul Sculier, John J Crowley, Dori Giroux, Jan Van Meerbeeck, and Peter Goldstraw. 2009. The International Association for the Study of Lung Cancer Staging Project: prognostic factors and pathologic TNM stage in surgically managed non-small cell lung cancer. Journal of Thoracic Oncology 4, 7 (2009), 792–801.
- [8] Salman UH Dar, Mahmut Yurt, Levent Karacan, Aykut Erdem, Erkut Erdem, and Tolga Cukur. 2019. Image synthesis in multi-contrast MRI with conditional generative adversarial networks. IEEE Transactions on Medical Imaging 38, 10 (2019), 2375–2388.
- [9] Sanuwani Dayarathna, Kh Tohidul Islam, Sergio Uribe, Guang Yang, Munawar Hayat, and Zhaolin Chen. 2024. Deep learning based synthesis of MRI, CT and PET: Review and analysis. Medical Image Analysis 92 (2024), 103046.
- [10] C de Margerie-Mellon, C De Bazelaire, and E De Kerviler. 2016. Image-guided biopsy in primary lung cancer: Why, when and how. Diagnostic and Interventional Imaging 97, 10 (2016), 965–972.
- [11] Francesco Di Feola, Lorenzo Tronchin, and Paolo Soda. 2023. A comparative study between paired and unpaired Image Quality Assessment in Low-Dose CT Denoising. In 2023 IEEE 36th International Symposium on Computer-Based Medical Systems (CBMS). IEEE, 471–476.
- [12] S. Gatidis and T. Kuestner. 2022. A whole-body FDG-PET/CT dataset with manually annotated tumor lesions (FDG-PET-CT-Lesions). The Cancer Imaging Archive [Data set]. doi:10.7937/gkr0-xv29
- [13] Valerio Guarrasi, Fatih Aksu, Camillo Maria Caruso, Francesco Di Feola, Aurora Rofena, Filippo Ruffini, and Paolo Soda. 2025. A systematic review of intermediate fusion in multimodal deep learning for biomedical applications. Image and Vision Computing (2025), 105509.
- [14] Valerio Guarrasi, Francesco Di Feola, Rebecca Restivo, Lorenzo Tronchin, and Paolo Soda. 2025. Whole-Body Image-to-Image Translation for a Virtual Scanner in a Healthcare Digital Twin. In 2025 IEEE 38th International Symposium on Computer-Based Medical Systems (CBMS). IEEE, 528–534.
- [15] Johannes Hofmanninger, Forian Prayer, Jeanny Pan, Sebastian Röhrich, Helmut Prosch, and Georg Langs. 2020. Automatic lung segmentation in routine imaging is primarily a data diversity problem, not a methodology problem. European Radiology Experimental 4 (2020), 1–13.
- [16] Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A Efros. 2017. Image-to-image translation with conditional adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1125–1134.
- [17] Binghu Jiang, Shodayu Takashima, Chie Miyake, Tomoaki Hakucho, Yoshiyuki Takahashi, Daisuke Morimoto, Hodaka Numasaki, Katsuyuki Nakanishi, Yasuhiko Tomita, and Masahiko Higashiyama. 2014. Thin-section CT findings in peripheral lung cancer of 3 cm or smaller: are there any characteristic features for predicting tumor histology or do they depend only on tu...
- [18] Margarita Kirienko, Luca Cozzi, Lidija Antunovic, Lisa Lozza, Antonella Fogliata, Emanuele Voulaz, Alexia Rossi, Arturo Chiti, and Martina Sollini. 2018. Prediction of disease-free survival by the PET/CT radiomic signature in non-small cell lung cancer patients undergoing surgery. European Journal of Nuclear Medicine and Molecular Imaging 45 (2018), 207–217.
- [19] Ping Li, Shuo Wang, Tang Li, Jingfeng Lu, Yunxin HuangFu, and Dongxue Wang. A Large-Scale CT and PET/CT Dataset for Lung Cancer Diagnosis [Data set]. https://www.cancerimagingarchive.net/collection/lung-pet-ct-dx/
- [21] RuoXi Qin, Zhenzhen Wang, LingYun Jiang, Kai Qiao, Jinjin Hai, Jian Chen, Junling Xu, Dapeng Shi, and Bin Yan. 2020. Fine-Grained Lung Cancer Classification from PET and CT Images Based on Multidimensional Attention Mechanism. Complexity 2020, 1 (2020), 6153657.
- [22] Aurora Rofena, Valerio Guarrasi, Marina Sarli, Claudia Lucia Piccolo, Matteo Sammarra, Bruno Beomonte Zobel, and Paolo Soda. 2024. A deep learning approach for virtual contrast enhancement in Contrast Enhanced Spectral Mammography. Computerized Medical Imaging and Graphics 116 (2024), 102398.
- [23] Lalith Kumar Shiyam Sundar, Josef Yu, Otto Muzik, Oana C Kulterer, Barbara Fueger, Daria Kifjak, Thomas Nakuz, Hyung Min Shin, Annika Katharina Sima, Daniela Kitzmantl, et al. 2022. Fully automated, semantic segmentation of whole-body 18F-FDG PET/CT images based on data-centric artificial intelligence. Journal of Nuclear Medicine 63, 12 (2022), 1941–1948.
- [24] Satoshi Takeuchi, Benjapa Khiewvan, Patricia S Fox, Stephen G Swisher, Eric M Rohren, Roland L Bassett, and Homer A Macapinlac. 2014. Impact of initial PET/CT staging in terms of clinical stage, management plan, and prognosis in 592 patients with non-small-cell lung cancer. European Journal of Nuclear Medicine and Molecular Imaging 41 (2014), 906–914.
- [25] The International Agency for Research on Cancer (IARC). 2024. Global cancer observatory. https://gco.iarc.fr/
- [26] Selene Tomassini, Nicola Falcionelli, Paolo Sernani, Laura Burattini, and Aldo Franco Dragoni. 2022. Lung nodule diagnosis and cancer histology classification from computed tomography data by convolutional neural networks: A survey. Computers in Biology and Medicine 146 (2022), 105691.
- [27] Du Tran, Heng Wang, Lorenzo Torresani, Jamie Ray, Yann LeCun, and Manohar Paluri. 2018. A closer look at spatiotemporal convolutions for action recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 6450–6459.
- [28] William D Travis, Elisabeth Brambilla, Andrew G Nicholson, Yasushi Yatabe, John HM Austin, Mary Beth Beasley, Lucian R Chirieac, Sanja Dacic, Edwina Duhig, and Douglas B Flieder. 2015. The 2015 World Health Organization classification of lung tumors: impact of genetic, clinical and radiologic advances since the 2004 classification. Journa...
- [29] Bing-Yen Wang, Jing-Yang Huang, Heng-Chung Chen, Ching-Hsiung Lin, Sheng-Hao Lin, Wei-Heng Hung, and Ya-Fu Cheng. 2020. The comparison between adenocarcinoma and squamous cell carcinoma in lung cancer patients. Journal of Cancer Research and Clinical Oncology 146, 1 (2020), 43–52.
- [30] Ge Wang, Andreu Badal, Xun Jia, Jonathan S Maltz, Klaus Mueller, Kyle J Myers, Chuang Niu, Michael Vannier, Pingkun Yan, Zhou Yu, et al. 2022. Development of metaverse for intelligent healthcare. Nature Machine Intelligence 4, 11 (2022), 922–929.
- [31] Weimiao Wu, Chintan Parmar, Patrick Grossmann, John Quackenbush, Philippe Lambin, Johan Bussink, Raymond Mak, and Hugo JWL Aerts. 2016. Exploratory study to identify radiomics classifiers for lung cancer histology. Frontiers in Oncology 6 (2016), 71.
- [32] Xinzhong Zhu, Di Dong, Zhendong Chen, Mengjie Fang, Liwen Zhang, Jiangdian Song, Dongdong Yu, Yali Zang, Zhenyu Liu, Jingyun Shi, et al. 2018. Radiomic signature as a diagnostic factor for histologic subtype classification of non-small cell lung cancer. European Radiology 28, 7 (2018), 2772–2778.