Radiology-Report Semantic Modelling and Host-Response Laboratory Biomarkers for Multimodal Survival Prediction in Lung Cancer

Feng-Ming (Spring) Kong; Gen Yang; Jingxiang Shi; Weihua Meng; Xiaoyan Li; Yan Zhang; Yiming Wang; Yuqi Ma; Zhengda Li

arxiv: 2606.13043 · v1 · pith:ZO5ANGJWnew · submitted 2026-06-11 · ⚛️ physics.med-ph

Pith reviewed 2026-06-27 05:22 UTC · model grok-4.3

classification ⚛️ physics.med-ph

keywords lung cancer · survival prediction · multimodal model · radiology reports · laboratory biomarkers · TNM staging · random survival forests · MC-BERT

The pith

A multimodal score fusing radiology-report semantics with lab biomarkers predicts lung cancer survival and stratifies patients within TNM stages.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops and tests a multimodal adaptive risk score (AMRS). Radiology reports are encoded with a domain-adapted MC-BERT model; routinely collected clinical and laboratory variables are modeled with random survival forests after Mahalanobis-distance imputation; and the two branches are combined by weighted risk fusion. In a retrospective two-center cohort of 574 patients, the score reaches C-index values of 0.920 in training and 0.849 in testing and separates survival curves within clinical subgroups and TNM strata. A sympathetic reader would care because TNM staging alone leaves large outcome heterogeneity unexplained, whereas the AMRS draws only on data already generated in standard care. SHAP analysis points to hematologic, inflammatory, coagulation, nutritional, tumor-marker, organ-function, and age-related variables as the main drivers. The authors conclude that the approach may complement anatomic staging in imaging-centered workflows once prospective validation is completed.

Core claim

In a retrospective two-center cohort of 574 lung cancer patients the AMRS, built by encoding radiology reports with MC-BERT, imputing laboratory variables with Mahalanobis distance, modeling with random survival forests, and performing weighted risk fusion, achieved C-indexes of 0.920 (training) and 0.849 (test) and separated survival trajectories across TNM-related strata.
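The headline numbers are concordance indices, so it helps to see what the metric counts. The sketch below is a minimal pure-Python version of Harrell's C-index for right-censored data. It is not the authors' evaluation code: the toy values are invented, and pairs with tied follow-up times are simply skipped, which is a simplifying assumption.

```python
from itertools import combinations

def harrell_c_index(times, events, risks):
    """Harrell's concordance index for right-censored data.

    times  : follow-up time per patient
    events : 1 if death was observed, 0 if censored
    risks  : predicted risk score (higher = worse prognosis)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient a has the shorter follow-up.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the two times differ and the
        # shorter one ends in an observed event (simplifying assumption:
        # tied times are skipped).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0      # earlier death had the higher risk
        elif risks[a] == risks[b]:
            concordant += 0.5      # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy data, invented for illustration (not from the paper).
times = [5, 8, 12, 20, 30]
events = [1, 1, 0, 1, 0]
risks = [0.9, 0.4, 0.7, 0.3, 0.1]

c = harrell_c_index(times, events, risks)
print(f"C-index: {c:.3f}")  # 7 of 8 comparable pairs are concordant: 0.875
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is why the gap between 0.920 (training) and 0.849 (test) matters for the overfitting discussion further down.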

What carries the argument

The multimodal adaptive risk score (AMRS), which encodes radiology reports with MC-BERT and fuses them with Mahalanobis-imputed laboratory variables via random survival forests and weighted fusion.

If this is right

AMRS can separate survival outcomes inside the same TNM stage, enabling finer risk stratification without new tests.
SHAP identifies hematologic, inflammatory, coagulation, nutritional, tumor-marker, organ-function, and age-related variables as the dominant contributors.
The fusion approach may be inserted into existing imaging-centered oncology workflows that already produce radiology reports.
Prospective validation, calibration checks, and ablation testing are required before any clinical deployment.
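Separating survival inside a single TNM stage is usually shown with Kaplan-Meier curves for score-defined risk groups. The following sketch implements the product-limit estimator in plain Python and applies it to a toy same-stage cohort split at the median risk score. The cohort values and the median cut-off are assumptions made for illustration; the abstract does not state how the authors defined their risk groups.

```python
from statistics import median

def kaplan_meier(times, events):
    """Product-limit estimate: [(t, S(t))] at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        # Deaths and censored patients both leave the risk set.
        at_risk -= removed
    return curve

# Toy same-stage cohort: (months, event, risk score). Invented values.
cohort = [(6, 1, 0.9), (9, 1, 0.8), (14, 0, 0.7), (18, 1, 0.6),
          (24, 1, 0.4), (30, 0, 0.3), (36, 0, 0.2), (40, 1, 0.1)]

cut = median(r for _, _, r in cohort)  # hypothetical cut-off choice
high = [(t, e) for t, e, r in cohort if r > cut]
low = [(t, e) for t, e, r in cohort if r <= cut]

high_curve = kaplan_meier(*zip(*high))
low_curve = kaplan_meier(*zip(*low))
print("high risk:", high_curve)
print("low risk: ", low_curve)
```

In a real analysis the cut-off would have to be fixed on the training cohort before looking at the test cohort, and the visual separation would be backed by a log-rank test or a Cox model.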

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the weights prove stable across sites, the method could lower dependence on costly genomic assays by exploiting data already collected in routine care.
The same report-plus-biomarker fusion pattern might transfer to other solid tumors where radiology reports are standard.
Site-specific recalibration may be needed because the two-center retrospective design leaves open the possibility of unmeasured selection bias.

Load-bearing premise

After exclusion of patients with short follow-up or missing reports, the retrospective two-center cohort remains representative of the target population, and the learned fusion weights will generalize to new patients and sites.

What would settle it

A prospective multi-center validation study in which the AMRS C-index drops below 0.75 or fails to separate survival curves inside TNM strata.

Figures

Figures reproduced from arXiv: 2606.13043 by Feng-Ming (Spring) Kong, Gen Yang, Jingxiang Shi, Weihua Meng, Xiaoyan Li, Yan Zhang, Yiming Wang, Yuqi Ma, Zhengda Li.

**Figure 2.** Multimodal AMRS workflow. Radiology reports are encoded with a domain…

**Figure 3.** Discrimination of AMRS and benchmark survival models. C-…

**Figure 4.** Clinical-laboratory feature importance. SHAP-based ranking of variables in the clinical risk branch, including hematologic indices, electrolytes, tumor markers, inflammatory variables, coagulation markers, nutritional markers, organ-function measures, and age.

AMRS separated survival across clinical subgroups. The next analysis tested whether AMRS remained informative beyond the overall cohort. A clinically…

**Figure 5.** Survival stratification across clinical subgroups. Kaplan…

**Figure 6.** TNM-related risk refinement. Kaplan-Meier curves for AMRS-defined risk groups within Group 1 and Group 3.

Fused AMRS components were associated with overall survival. The final analysis examined whether the learned fused representation contained components individually associated with survival. Among 32 fused components, six reached P < 0.05 in univariate Cox regression: fused_0 (P = 0.0105), fused_1 (P = 0…

**Figure 7.** Cox analysis of fused AMRS components. Univariate Cox regression P…

Original abstract

TNM staging is essential for lung cancer management, but patients within the same anatomic stage often show heterogeneous survival outcomes. We developed a multimodal adaptive risk score (AMRS) that integrates radiology-report semantics with routinely available clinical laboratory biomarkers. In a retrospective two-center cohort, 1129 patients diagnosed between December 2017 and February 2026 were screened; 574 patients were included after exclusion for short follow-up or missing imaging reports and were split into training (n = 459) and test (n = 115) cohorts. Radiology reports were encoded with a domain-adapted MC-BERT branch to capture imaging-derived semantic information, while clinical and laboratory variables were modeled after Mahalanobis-distance-based imputation using random survival forests. Weighted risk fusion generated the final patient-level score. AMRS achieved C-index values of 0.920 in training and 0.849 in testing, and separated survival trajectories across clinical subgroups and TNM-related strata. SHAP analysis identified hematologic, inflammatory, coagulation, nutritional, tumor-marker, organ-function, and age-related contributors. AMRS may complement TNM staging in imaging-centered oncology workflows, but prospective validation, calibration, ablation testing, and clinical-utility assessment are required before deployment.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper reports a test C-index of 0.849 for its report-plus-lab fusion, but a single split made after dropping nearly half of the screened patients (555 of 1129) leaves the performance claim open to selection bias and overfitting.

The paper presents a multimodal adaptive risk score that encodes radiology reports with domain-adapted MC-BERT, imputes missing lab values via Mahalanobis distance, fits random survival forests, and fuses the outputs with learned weights. What is new is the concrete pipeline that combines semantic report features with routine biomarkers in one score for lung cancer. It does a reasonable job showing the resulting AMRS separates survival curves across subgroups and TNM strata in their data, and the SHAP breakdown identifies plausible contributors from hematologic, inflammatory, and other lab categories.
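The abstract does not say how Mahalanobis distance is used for imputation. One plausible reading, sketched below as an assumption rather than as the authors' method, is nearest-donor imputation: a patient with a missing lab value borrows the mean of that value from the k complete cases that are closest in Mahalanobis distance on the features the patient does have. The feature names, values, and k are hypothetical.

```python
def covariance(rows):
    """Sample covariance matrix of a list of equal-length rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[sum((r[a] - means[a]) * (r[b] - means[b]) for r in rows) / (n - 1)
             for b in range(d)] for a in range(d)]

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def mahalanobis_sq(u, v, cov):
    """Squared Mahalanobis distance (u - v)^T cov^{-1} (u - v)."""
    diff = [a - b for a, b in zip(u, v)]
    return sum(d * s for d, s in zip(diff, solve(cov, diff)))

def impute(patient, complete_rows, k=2):
    """Fill None entries with the mean of the k nearest complete donors."""
    obs = [j for j, x in enumerate(patient) if x is not None]
    cov = covariance([[r[j] for j in obs] for r in complete_rows])
    target = [patient[j] for j in obs]
    donors = sorted(
        complete_rows,
        key=lambda r: mahalanobis_sq(target, [r[j] for j in obs], cov),
    )[:k]
    return [x if x is not None else sum(d[j] for d in donors) / k
            for j, x in enumerate(patient)]

# Hypothetical complete cases: [albumin g/L, CRP mg/L, hemoglobin g/L].
complete = [[42, 3, 140], [38, 12, 125], [30, 60, 100],
            [35, 25, 115], [44, 2, 150]]
patient = [31, 55, None]  # hemoglobin missing

imputed = impute(patient, complete)
print(imputed)  # hemoglobin filled from the two nearest donors: 107.5
```

To keep the evaluation clean, the covariance matrix and the donor pool would have to come from the training cohort only; estimating them on the pooled data would leak test information into the model inputs.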

The practical angle is that everything uses data already collected in standard oncology care, so the approach could slot into existing imaging workflows without new tests.

The soft spots are the design choices that limit how much we can trust the numbers. They screened 1129 patients but kept only 574 after excluding short follow-up or missing reports, then used one 459/115 split. No ablations, no calibration plots, no direct TNM baseline comparison, and no external or center-stratified validation appear in the abstract. The train-to-test drop from 0.920 to 0.849 is consistent with some overfitting to the retained longer-surviving cases. The fusion weights are also fit on the training distribution, which adds another layer of dependence even if test performance is reported separately.
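The cohort arithmetic behind these concerns can be checked directly from the numbers in the abstract. The short calculation below uses only those reported counts; the pair count is an upper bound, since the number of events in the test cohort is not given.

```python
# Counts reported in the abstract.
screened = 1129
included = 574
train_n, test_n = 459, 115

excluded = screened - included
excluded_fraction = excluded / screened
test_fraction = test_n / included

# Upper bound on patient pairs available to the test C-index;
# censoring can only reduce the number of comparable pairs.
max_test_pairs = test_n * (test_n - 1) // 2

print(f"excluded: {excluded} of {screened} ({excluded_fraction:.1%})")
print(f"train + test = {train_n + test_n}")
print(f"test share of included cohort: {test_fraction:.1%}")
print(f"at most {max_test_pairs} test pairs")
```

So 555 patients, about 49% of those screened, never reach the model, and the reported test C-index rests on a single hold-out of 115 patients, roughly a 80/20 split.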

This is for researchers building multimodal prognostic tools in lung cancer who want a worked example of report semantics plus labs. A reader could extract the encoding and imputation steps as a template, but the evidence is too preliminary to support strong claims about added value over TNM.

I would send it to peer review. The core idea is straightforward and the reported test performance is not trivial, but referees will need to require ablations, repeated splits or external validation, and clearer handling of the exclusions before the central claim holds up.

Referee Report

3 major / 1 minor

Summary. The manuscript describes the development of a multimodal adaptive risk score (AMRS) for survival prediction in lung cancer. It integrates semantic information from radiology reports using a domain-adapted MC-BERT model with clinical laboratory biomarkers processed via Mahalanobis-distance imputation and random survival forests. In a retrospective cohort of 574 patients from two centers (after screening 1129 and exclusions for short follow-up or missing reports), split 459/115 train/test, the AMRS achieves C-indices of 0.920 and 0.849 respectively, and demonstrates separation of survival curves across subgroups and TNM strata. SHAP analysis highlights contributions from various biomarker categories.

Significance. If the reported performance generalizes, this work could provide a valuable complement to TNM staging by incorporating imaging-derived semantics and routine labs into a unified risk score for better stratification within anatomic stages. The multimodal fusion approach and use of SHAP for identifying contributors from hematologic, inflammatory, and other categories are positive elements that address a clinically relevant problem of outcome heterogeneity.
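SHAP rankings such as the one in Figure 4 approximate Shapley values, which split a prediction's deviation from a baseline among the input features. For a handful of features the values can be computed exactly by averaging marginal contributions over all feature orderings, as in this sketch. The risk function, feature names (NLR stands for neutrophil-to-lymphocyte ratio), and numbers are hypothetical stand-ins, not the paper's model.

```python
from itertools import permutations

def shapley_values(model, x, baseline):
    """Exact Shapley attributions by enumerating all feature orderings."""
    names = list(x)
    phi = {name: 0.0 for name in names}
    orders = list(permutations(names))
    for order in orders:
        current = dict(baseline)       # start from the reference patient
        prev = model(current)
        for name in order:
            current[name] = x[name]    # switch one feature to its real value
            new = model(current)
            phi[name] += new - prev    # marginal contribution in this order
            prev = new
    return {name: total / len(orders) for name, total in phi.items()}

def toy_risk(f):
    """Hypothetical risk function with an age x NLR interaction."""
    return (0.02 * f["age"] + 0.4 * f["nlr"]
            - 0.03 * f["albumin"] + 0.005 * f["age"] * f["nlr"])

baseline = {"age": 60, "nlr": 2.0, "albumin": 40}  # reference patient
x = {"age": 70, "nlr": 6.0, "albumin": 30}         # patient to explain

phi = shapley_values(toy_risk, x, baseline)
for name, value in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:8s} {value:+.3f}")
```

The attributions sum to the difference between the patient's risk and the baseline risk. Exact enumeration grows factorially with the number of features, which is why practical SHAP implementations rely on model-specific shortcuts or sampling.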

major comments (3)

[Abstract] Abstract: The central performance claims rest on C-index values of 0.920 (training) and 0.849 (testing) from a single random split of a retrospective cohort after excluding 555 of 1129 screened patients for short follow-up or missing reports. No k-fold CV, repeated splits, center-stratified hold-out, or external validation is described, which is load-bearing for the generalizability claim and leaves the results vulnerable to selection bias and overfitting in the complex pipeline (MC-BERT adaptation, imputation, RSF, weighted fusion).
[Abstract] Abstract: No ablation results, baseline comparisons to TNM staging alone or to unimodal models, or calibration plots are supplied. This undermines assessment of the incremental contribution of the radiology-report semantics and the reliability of the AMRS predictions across the reported subgroups and TNM strata.
[Abstract] Abstract: The final AMRS is generated by weighted risk fusion whose weights are fitted on the training distribution; while test-set performance is reported separately, the abstract provides no indication that fusion parameters were fixed prior to test evaluation or that the model avoids circularity in the 0.849 C-index.
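The first major comment asks for k-fold or center-stratified validation instead of a single random split. A minimal sketch of what that could look like follows: a stratified k-fold splitter that spreads each (center, event) combination evenly across folds. The center sizes and event rate are invented, since the abstract reports neither.

```python
import random
from collections import defaultdict

def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) with each label spread across folds."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    folds = [[] for _ in range(k)]
    offset = 0
    for label in sorted(by_label):
        members = by_label[label]
        rng.shuffle(members)
        # Deal members round-robin, continuing where the previous
        # stratum stopped so overall fold sizes stay balanced.
        for pos, idx in enumerate(members):
            folds[(offset + pos) % k].append(idx)
        offset += len(members)
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test

# Hypothetical cohort: 574 patients, center A or B, random event flag.
rng = random.Random(42)
labels = [("A" if i < 350 else "B", rng.random() < 0.4) for i in range(574)]

splits = list(stratified_kfold(labels, k=5, seed=0))
for fold, (train, test) in enumerate(splits):
    n_center_a = sum(labels[i][0] == "A" for i in test)
    print(f"fold {fold}: train={len(train)} test={len(test)} center A={n_center_a}")
```

Every preprocessing and fitting step (text encoder adaptation, imputation, survival forest, fusion weights) would need to be refit inside each training fold. A stricter alternative is to train on one center and test on the other, which probes transfer across sites rather than within them.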

minor comments (1)

[Abstract] Abstract: The screening date range ending in February 2026 appears inconsistent with a retrospective study and may require clarification or correction.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive comments on our manuscript. We address each major comment point by point below and indicate the revisions to be incorporated.

Referee: [Abstract] Abstract: The central performance claims rest on C-index values of 0.920 (training) and 0.849 (testing) from a single random split of a retrospective cohort after excluding 555 of 1129 screened patients for short follow-up or missing reports. No k-fold CV, repeated splits, center-stratified hold-out, or external validation is described, which is load-bearing for the generalizability claim and leaves the results vulnerable to selection bias and overfitting in the complex pipeline (MC-BERT adaptation, imputation, RSF, weighted fusion).

Authors: We acknowledge that the reported results rely on a single random train-test split, chosen because of the retrospective design and the post-exclusion sample size. To address the concern, we will add k-fold cross-validation within the training cohort and report the resulting C-indices in the revised manuscript. No center-stratified hold-out was performed because the split was random; external validation remains future work, in line with the prospective validation the abstract already calls for. Revision: yes.

Referee: [Abstract] Abstract: No ablation results, baseline comparisons to TNM staging alone or to unimodal models, or calibration plots are supplied. This undermines assessment of the incremental contribution of the radiology-report semantics and the reliability of the AMRS predictions across the reported subgroups and TNM strata.

Authors: We agree these elements are needed to demonstrate incremental value. The revised manuscript will include ablation experiments, comparisons against TNM staging alone and the two unimodal models (radiology-report semantics and laboratory biomarkers), and calibration plots to assess prediction reliability across subgroups and TNM strata. Revision: yes.

Referee: [Abstract] Abstract: The final AMRS is generated by weighted risk fusion whose weights are fitted on the training distribution; while test-set performance is reported separately, the abstract provides no indication that fusion parameters were fixed prior to test evaluation or that the model avoids circularity in the 0.849 C-index.

Authors: The weighted fusion parameters were fitted exclusively on the training set and held fixed for the independent test set, ensuring no circularity. We will revise the abstract and methods section to state this explicitly and remove any ambiguity. Revision: yes.

Circularity Check

0 steps flagged

No significant circularity in derivation chain

The paper describes a standard ML pipeline: MC-BERT encoding of reports, Mahalanobis imputation, random survival forests, and weighted fusion, all fit on the training split (n=459) with C-index reported separately on the held-out test split (n=115). No equations or steps reduce the reported test performance or AMRS construction to the inputs by definition. No self-citations are invoked as load-bearing uniqueness theorems. The train C-index is transparently labeled as training performance rather than presented as an independent prediction. The derivation remains self-contained against the external test cohort benchmark.
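The non-circular protocol described here can be made concrete for the fusion step. In the sketch below, a single fusion weight is chosen by grid search on training data and then applied unchanged to held-out patients. The convex-combination form, the grid search, and all numbers are assumptions for illustration; the abstract says only that "weighted risk fusion" produced the final score.

```python
from itertools import combinations

def c_index(times, events, risks):
    """Compact Harrell's C (tied times skipped, tied risks count 0.5)."""
    num = den = 0.0
    for i, j in combinations(range(len(times)), 2):
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if times[a] == times[b] or not events[a]:
            continue
        den += 1
        if risks[a] > risks[b]:
            num += 1.0
        elif risks[a] == risks[b]:
            num += 0.5
    return num / den

def fuse(text_risk, lab_risk, w):
    """Convex combination of two branch risks (hypothetical fusion rule)."""
    return [w * t + (1.0 - w) * l for t, l in zip(text_risk, lab_risk)]

def fit_fusion_weight(times, events, text_risk, lab_risk, steps=20):
    """Grid-search w in [0, 1] using TRAINING data only."""
    grid = [k / steps for k in range(steps + 1)]
    return max(grid, key=lambda w: c_index(times, events,
                                           fuse(text_risk, lab_risk, w)))

# Toy training cohort (invented values).
train_times = [4, 7, 10, 15, 22, 30]
train_events = [1, 1, 1, 0, 1, 0]
train_text = [0.8, 0.3, 0.6, 0.5, 0.2, 0.1]  # report-branch risk
train_lab = [0.7, 0.9, 0.4, 0.3, 0.5, 0.2]   # lab-branch risk

w = fit_fusion_weight(train_times, train_events, train_text, train_lab)
train_c = c_index(train_times, train_events, fuse(train_text, train_lab, w))

# Toy held-out cohort: w is frozen and never refit here.
test_times = [5, 9, 16, 28]
test_events = [1, 0, 1, 1]
test_text = [0.7, 0.6, 0.4, 0.2]
test_lab = [0.8, 0.3, 0.5, 0.1]
test_c = c_index(test_times, test_events, fuse(test_text, test_lab, w))

print(f"w={w:.2f} train C={train_c:.3f} test C={test_c:.3f}")
```

Because the grid includes w = 0 and w = 1, the fused training C-index can never fall below either single branch, so training performance alone says little about whether fusion adds value; only the frozen-weight evaluation on held-out patients does.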

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

Only the abstract is available, so the ledger records components explicitly named or necessarily implied by the described pipeline; many implementation details remain unknown.

free parameters (2)

fusion weights
Weighted risk fusion requires parameters that combine the MC-BERT and random-forest branches; these are necessarily fitted to the training cohort.
MC-BERT domain-adaptation parameters
Domain adaptation of the BERT model on radiology reports introduces additional fitted parameters.

axioms (2)

domain assumption: Radiology reports contain extractable semantic features that are prognostic for survival beyond TNM stage.
Invoked by the choice to encode reports with MC-BERT and fuse the output into the risk score.
domain assumption: Laboratory biomarkers supply independent prognostic signal after Mahalanobis imputation.
Core premise of the multimodal design.

pith-pipeline@v0.9.1-grok · 5783 in / 1525 out tokens · 25304 ms · 2026-06-27T05:22:30.059907+00:00 · methodology

[1] [1]

Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries

Bray F, Laversanne M, Sung H, et al. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2024;74:229-

2022

[2] [2]

doi:10.3322/caac.21834

work page doi:10.3322/caac.21834

[3] [3]

The biology and management of non-small cell lung cancer

Herbst RS, Morgensztern D, Boshoff C. The biology and management of non-small cell lung cancer. Nature. 2018;553:446-454. doi:10.1038/nature25183

work page doi:10.1038/nature25183 2018

[4] [4]

Rami-Porta R, Nishimura KK, Giroux DJ, et al. The International Association for the Study of Lung Cancer Lung Cancer Staging Project: proposals for revision of the TNM stage groups in the forthcoming ninth edition of the TNM classification for lung cancer. J Thorac Oncol. 2024;19:1007-

2024

[5] [5]

doi:10.1016/j.jtho.2024.02.011

work page doi:10.1016/j.jtho.2024.02.011 2024

[6] [6]

The proposed ninth edition TNM classification of lung cancer

Detterbeck FC, Woodard GA, Bader AS, et al. The proposed ninth edition TNM classification of lung cancer. Chest. 2024;166:882-895. doi:10.1016/j.chest.2024.05.026

work page doi:10.1016/j.chest.2024.05.026 2024

[7] [7]

Proposed ninth edition TNM staging system for lung cancer: guide for radiologists

Klug M, Kirshenboim Z, Truong MT, et al. Proposed ninth edition TNM staging system for lung cancer: guide for radiologists. Radiographics. 2024;44:e240057. doi:10.1148/rg.240057

work page doi:10.1148/rg.240057 2024

[8] [8]

Cancer-related inflammation

Mantovani A, Allavena P, Sica A, Balkwill F. Cancer-related inflammation. Nature. 2008;454:436-

2008

[9] [9]

doi:10.1038/nature07205

work page doi:10.1038/nature07205

[10] [10]

Artificial intelligence for multimodal data integration in oncology

Lipkova J, Chen RJ, Chen B, et al. Artificial intelligence for multimodal data integration in oncology. Cancer Cell. 2022;40:1095-1110. doi:10.1016/j.ccell.2022.09.012

work page doi:10.1016/j.ccell.2022.09.012 2022

[11] [11]

Maron, Mohamed Ahmed, Susie Kim, Mono Pirun, Walid K

Jee J, Fong C, Pichotta K, et al. Automated real-world data integration improves cancer outcome prediction. Nature. 2024;636:728-736. doi:10.1038/s41586-024-08167-5

work page doi:10.1038/s41586-024-08167-5 2024

[12] [12]

TRIPOD+AI statement: Updated guidance for reporting clinical prediction models that use regression or machine learning methods

Collins GS, Moons KGM, Dhiman P, et al. TRIPOD+AI statement: updated guidance for reporting clinical prediction models that use regression or machine learning methods. BMJ. 2024;385:e078378. doi:10.1136/bmj-2023-078378

work page doi:10.1136/bmj-2023-078378 2024

[13] [13]

Automated extraction of information of lung cancer staging from unstructured reports of PET-CT interpretation: natural language processing with deep-learning

Park HJ, Park N, Lee JH, et al. Automated extraction of information of lung cancer staging from unstructured reports of PET-CT interpretation: natural language processing with deep-learning. BMC Med Inform Decis Mak. 2022;22:229. doi:10.1186/s12911-022-01975-7

work page doi:10.1186/s12911-022-01975-7 2022

[14] [14]

Kogalur, Eugene H

Ishwaran H, Kogalur UB, Blackstone EH, Lauer MS. Random survival forests. Ann Appl Stat. 2008;2:841-860. doi:10.1214/08-AOAS169

work page doi:10.1214/08-aoas169 2008

[15] [15]

Regression models and life-tables

Cox DR. Regression models and life-tables. J R Stat Soc Series B Stat Methodol. 1972;34:187-220

1972

[16] [16]

Katzman, Uri Shaham, Alexander Cloninger, Jonathan Bates, Tingting Jiang, and Yuval Kluger

Katzman JL, Shaham U, Cloninger A, et al. DeepSurv: personalized treatment recommender system using a Cox proportional hazards deep neural network. BMC Med Res Methodol. 2018;18:24. doi:10.1186/s12874-018-0482-1

[17]

Lee C, Zame W, Yoon J, van der Schaar M. DeepHit: a deep learning approach to survival analysis with competing risks. Proc AAAI Conf Artif Intell. 2018;32. doi:10.1609/aaai.v32i1.11842

[18]

Devlin J, Chang MW, Lee K, Toutanova K. BERT: pre-training of deep bidirectional transformers for language understanding. Proc NAACL-HLT. 2019:4171-4186

[19]

Lee J, Yoon W, Kim S, et al. BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics. 2020;36:1234-1240. doi:10.1093/bioinformatics/btz682

[20]

Alsentzer E, Murphy J, Boag W, et al. Publicly available clinical BERT embeddings. Proc 2nd Clinical Natural Language Processing Workshop. 2019:72-78. doi:10.18653/v1/W19-1909

[21]

Lundberg SM, Lee SI. A unified approach to interpreting model predictions. Adv Neural Inf Process Syst. 2017;30:4765-4774

[22]

Zhang Y, Chen Y, Guo C, Li S, Huang C. Systemic immune-inflammation index as a predictor of survival in non-small cell lung cancer patients undergoing immune checkpoint inhibition: a systematic review and meta-analysis. Crit Rev Oncol Hematol. 2025;210:104669. doi:10.1016/j.critrevonc.2025.104669

[23]

Wang L, Long X, Zhu Y, et al. Association of prognostic nutritional index with long-term survival in lung cancer receiving immune checkpoint inhibitors: a meta-analysis. Medicine (Baltimore). 2024;103:e41087. doi:10.1097/MD.0000000000041087

[24]

Ma M, Cao R, Wang W, et al. The D-dimer level predicts the prognosis in patients with lung cancer: a systematic review and meta-analysis. J Cardiothorac Surg. 2021;16:243. doi:10.1186/s13019-021-01618-4

[25]

Zhang Z, Li Y, Yan X, Song Q, Wang G, Hu Y. Pretreatment lactate dehydrogenase may predict outcome of advanced non-small-cell lung cancer patients treated with immune checkpoint inhibitors: a meta-analysis. Cancer Med. 2019;8:1467-1473. doi:10.1002/cam4.2024

[26]

Zhang Y, Chen B, Wang L, Wang R, Yang X. Systemic immune-inflammation index is a promising noninvasive marker to predict survival of lung cancer: a meta-analysis. Medicine (Baltimore). 2019;98:e13788. doi:10.1097/MD.0000000000013788

[27]

Yang Y, Li J, Wang Y, et al. Prognostic value of the systemic immune-inflammation index in lung cancer patients receiving immune checkpoint inhibitors: a meta-analysis. PLoS One. 2024;19:e0312605. doi:10.1371/journal.pone.0312605

[28]

Arends J, Bachmann P, Baracos V, et al. ESPEN guidelines on nutrition in cancer patients. Clin Nutr. 2017;36:11-48. doi:10.1016/j.clnu.2016.07.015

[29]

Deng T, Zhang J, Meng Y, et al. Higher pretreatment lactate dehydrogenase concentration predicts worse overall survival in patients with lung cancer. Medicine (Baltimore). 2018;97:e12524. doi:10.1097/MD.0000000000012524

[30]

Collins GS, Reitsma JB, Altman DG, Moons KGM. Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): the TRIPOD statement. Ann Intern Med. 2015;162:55-63. doi:10.7326/M14-0697

[31]

Wolff RF, Moons KGM, Riley RD, et al. PROBAST: a tool to assess the risk of bias and applicability of prediction model studies. Ann Intern Med. 2019;170:51-58. doi:10.7326/M18-1376

[32]

Vickers AJ, Elkin EB. Decision curve analysis: a novel method for evaluating prediction models. Med Decis Making. 2006;26:565-574. doi:10.1177/0272989X06295361

[33]

Heagerty PJ, Lumley T, Pepe MS. Time-dependent ROC curves for censored survival data and a diagnostic marker. Biometrics. 2000;56:337-344. doi:10.1111/j.0006-341X.2000.00337.x

[34]

Cho H, Yoo S, Kim B, et al. Extracting lung cancer staging descriptors from pathology reports: a generative language model approach. J Biomed Inform. 2024;157:104720. doi:10.1016/j.jbi.2024.104720

[35]

Barlow SH, Chicklore S, He Y, et al. Uncertainty-aware automatic TNM staging classification for [18F]FDG PET-CT reports for lung cancer utilising transformer-based language models and multi-task learning. BMC Med Inform Decis Mak. 2024;24:396. doi:10.1186/s12911-024-02814-7

[36]

Aerts HJWL, Velazquez ER, Leijenaar RTH, et al. Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach. Nat Commun. 2014;5:4006. doi:10.1038/ncomms5006

[37]

Van Calster B, McLernon DJ, van Smeden M, et al. Calibration: the Achilles heel of predictive analytics. BMC Med. 2019;17:230. doi:10.1186/s12916-019-1466-7
