Assessing the Validity of a a priori Patient-Trial Generalizability Score using Real-world Data from a Large Clinical Data Research Network: A Colorectal Cancer Clinical Trial Case Study
Pith reviewed 2026-05-25 16:38 UTC · model grok-4.3
The pith
A score measuring how closely real-world patients match colorectal cancer trial criteria predicts fewer serious adverse events.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The study constructs a composite patient-trial generalizability (cPTG) score from the eligibility criteria of Bevacizumab colorectal cancer trials and applies it to real-world patients in the OneFlorida EHR consortium. Regression modeling shows that higher cPTG scores are associated with lower counts of serious adverse events, supporting the claim that patients more similar to the trial population encounter fewer safety issues when the treatment is used in practice.
What carries the argument
The composite patient-trial generalizability (cPTG) score, which aggregates similarity between an individual patient's characteristics and the eligibility criteria of the source trials.
If this is right
- Higher cPTG scores correspond to lower numbers of serious adverse events in observational data.
- Patients whose profiles align more closely with trial eligibility criteria face a reduced risk of serious adverse events.
- The cPTG score functions as a valid a priori indicator of how well trial results will generalize.
- Zero-inflated negative binomial regression can handle the excess zeros typical in adverse-event counts when testing such scores.
Where Pith is reading between the lines
- Routine calculation of the score from existing EHR fields could help clinicians anticipate which patients are more likely to experience complications.
- The same score construction might be tested on other oncology drugs whose trial populations differ from everyday practice.
- If the pattern replicates across diseases, eligibility criteria themselves could serve as built-in signals for expected real-world safety.
Load-bearing premise
The cPTG score captures every important way a patient can differ from the trial population, and the EHR records contain no unmeasured factors that affect both the score and the adverse-event count.
What would settle it
An independent cohort of colorectal cancer patients receiving Bevacizumab in which cPTG scores show no association with the number of serious adverse events.
read the original abstract
Existing trials had not taken enough consideration of their population representativeness, which can lower the effectiveness when the treatment is applied in real-world clinical practice. We analyzed the eligibility criteria of Bevacizumab colorectal cancer treatment trials, assessed their a priori generalizability, and examined how it affects patient outcomes when applied in real-world clinical settings. To do so, we extracted patient-level data from a large collection of electronic health records (EHRs) from the OneFlorida consortium. We built a zero-inflated negative binomial model using a composite patient-trial generalizability (cPTG) score to predict patients clinical outcomes (i.e., number of serious adverse events, (SAEs)). Our study results provide a body of evidence that 1) the cPTG scores can predict patient outcomes; and 2) patients who are more similar to the study population in the trials that were used to develop the treatment will have a significantly lower possibility to experience serious adverse events.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper extracts eligibility criteria from Bevacizumab colorectal cancer trials, constructs a composite patient-trial generalizability (cPTG) score, and fits a zero-inflated negative binomial regression on OneFlorida EHR data to test whether higher cPTG scores predict lower counts of serious adverse events (SAEs). The central claim is that the cPTG score is predictive and that patients more similar to the trial population experience significantly fewer SAEs in real-world data.
Significance. If the association survives adjustment for health-status confounders and is shown to be robust, the result would supply empirical support for using a priori generalizability scores to anticipate real-world performance of trial-derived treatments. The work sits at the intersection of trial design and observational pharmacoepidemiology; however, the current manuscript supplies neither effect sizes nor sensitivity analyses, so its immediate contribution remains limited.
major comments (3)
- [Abstract] Abstract (modeling approach): The zero-inflated negative binomial regression is presented with cPTG as the key predictor of SAE count, yet the description gives no indication that the model includes a rich set of baseline covariates (performance status, unrecorded comorbidities, socioeconomic factors) that could confound both eligibility match and SAE incidence. Without such adjustment or a negative-control/sensitivity analysis, the reported association is vulnerable to the unmeasured-confounding concern raised in the skeptic note.
- [Abstract] Abstract: No regression coefficients, incidence-rate ratios, confidence intervals, or model-fit statistics for the cPTG term are supplied, so it is impossible to judge the magnitude, precision, or practical importance of the claimed predictive relationship.
- [Abstract] Abstract: The manuscript asserts that the cPTG score is constructed a priori and that the regression is predictive rather than definitional, but provides no explicit statement that score construction and model selection were performed without reference to the same OneFlorida data later used for validation. This leaves open the possibility of circularity noted in the reader's assessment.
minor comments (1)
- [Abstract] Abstract: The phrase 'a a priori' contains a typographical error and should read 'an a priori'.
Simulated Author's Rebuttal
We thank the referee for the constructive comments on our manuscript. We address each of the three major comments below and outline the revisions we will make to strengthen the paper.
read point-by-point responses
-
Referee: [Abstract] Abstract (modeling approach): The zero-inflated negative binomial regression is presented with cPTG as the key predictor of SAE count, yet the description gives no indication that the model includes a rich set of baseline covariates (performance status, unrecorded comorbidities, socioeconomic factors) that could confound both eligibility match and SAE incidence. Without such adjustment or a negative-control/sensitivity analysis, the reported association is vulnerable to the unmeasured-confounding concern raised in the skeptic note.
Authors: We agree that adjustment for additional baseline covariates would help address potential confounding. The current model focuses on the cPTG score as the primary predictor, but the OneFlorida EHR data contain several relevant variables (demographics, recorded comorbidities). In the revised manuscript we will refit the zero-inflated negative binomial model both with and without these covariates, report the corresponding incidence-rate ratios, and add sensitivity analyses to evaluate robustness to unmeasured confounding. revision: yes
-
Referee: [Abstract] Abstract: No regression coefficients, incidence-rate ratios, confidence intervals, or model-fit statistics for the cPTG term are supplied, so it is impossible to judge the magnitude, precision, or practical importance of the claimed predictive relationship.
Authors: The full regression output, including coefficients, incidence-rate ratios, 95% confidence intervals, and model-fit statistics for the cPTG term, appears in the Results section of the manuscript. To make these quantitative findings immediately accessible, we will add the key effect-size estimates and confidence intervals to the abstract in the revision. revision: yes
-
Referee: [Abstract] Abstract: The manuscript asserts that the cPTG score is constructed a priori and that the regression is predictive rather than definitional, but provides no explicit statement that score construction and model selection were performed without reference to the same OneFlorida data later used for validation. This leaves open the possibility of circularity noted in the reader's assessment.
Authors: The cPTG score is derived solely from the eligibility criteria published in the trial protocols; the OneFlorida data are used exclusively for the subsequent validation step. We will insert an explicit statement in the Methods section clarifying this temporal and data-source separation to remove any ambiguity about circularity. revision: yes
Circularity Check
No significant circularity; cPTG constructed a priori from eligibility criteria and tested on independent EHR outcomes
full rationale
The paper defines the composite patient-trial generalizability (cPTG) score from trial eligibility criteria before any outcome modeling, then applies the fixed score to OneFlorida EHR patients and fits a zero-inflated negative binomial regression of SAE count on that score. No equation or step reduces the reported association to a fit on the same data, a self-citation chain, or a definitional tautology; the regression coefficient is an empirical test on held-out real-world data and remains falsifiable. Self-citation of prior cPTG work, if present, supplies only the score definition and does not bear the load of the outcome association. This is a standard external-validation design with no circular reduction.
Axiom & Free-Parameter Ledger
free parameters (1)
- ZINB regression coefficients and dispersion parameters
axioms (2)
- domain assumption The composite patient-trial generalizability score validly quantifies similarity to trial eligibility criteria
- domain assumption The zero-inflated negative binomial distribution is an appropriate model for the distribution of serious adverse event counts
invented entities (1)
-
cPTG score
no independent evidence
Reference graph
Works this paper leans on
-
[1]
https://medlineplus.gov/magazine/issues/summer11/articles/summer11pg2-3.html
From the NIH Director: The Importance of Clinical Trials | NIH MedlinePlus the Magazine. https://medlineplus.gov/magazine/issues/summer11/articles/summer11pg2-3.html. Accessed March 4, 2019
work page 2019
-
[2]
to whom do the results of this trial apply?
Rothwell PM. External validity of randomised controlled trials: “to whom do the results of this trial apply?” Lancet Lond Engl. 2005;365(9453):82-93. doi:10.1016/S0140-6736(04)17670-8
-
[3]
External and internal validity in clinical trials
Sedgwick P. External and internal validity in clinical trials. BMJ. 2012;344(feb16 1):e1004-e1004. doi:10.1136/bmj.e1004
-
[4]
Schoenmaker N, Van Gool WA. The age gap between patients in clinical studies and in the general population: a pitfall for dementia research. Lancet Neurol. 2004;3(10):627-630. doi:10.1016/S1474-4422(04)00884-1
-
[5]
Sardar MR, Badri M, Prince CT, Seltzer J, Kowey PR. Underrepresentation of women, elderly patients, and racial minorities in the randomized trials used for cardiovascular guidelines. JAMA Intern Med. 2014;174(11):1868-1870. doi:10.1001/jamainternmed.2014.4758
-
[6]
Battisti NML, Sehovic M, Extermann M. Assessment of the External Validity of the National Comprehensive Cancer Network and European Society for Medical Oncology Guidelines for Non -Small-Cell Lung Cancer in a Population of Patients Aged 80 Years and Older. Clin Lung Cancer. 2017;18(5):460-471. doi:10.1016/j.cllc.2017.03.005
-
[7]
Participation of older people in preauthorization trials of recently approved medicines
Beers E, Moerkerken DC, Leufkens HGM, Egberts TCG, Jansen PAF. Participation of older people in preauthorization trials of recently approved medicines. J Am Geriatr Soc. 2014;62(10):1883-1890. doi:10.1111/jgs.13067
-
[8]
Bellera C, Praud D, Petit-Monéger A, McKelvie-Sebileau P, Soubeyran P, Mathoulin-Pélissier S. Barriers to inclusion of older adults in randomised controlled clinical trials on Non-Hodgkin’s lymphoma: a systematic review. Cancer Treat Rev. 2013;39(7):812-817. doi:10.1016/j.ctrv.2013.01.007
-
[9]
Golomb BA, Chan VT, Evans MA, Koperski S, White HL, Criqui MH. The older the better: are elderly study participants more non-representative? A cross-sectional analysis of clinical trial and observational study samples. BMJ Open. 2012;2(6). doi:10.1136/bmjopen-2012-000833
-
[10]
Participation of patients 65 years of age or older in cancer clinical trials
Lewis JH, Kilgore ML, Goldman DP, et al. Participation of patients 65 years of age or older in cancer clinical trials. J Clin Oncol Off J Am Soc Clin Oncol. 2003;21(7):1383-1389. doi:10.1200/JCO.2003.08.010
-
[11]
Wysowski DK, Swartz L. Adverse drug event surveillance and drug withdrawals in the United States, 1969 - 2002: the importance of reporting suspected reactions. Arch Intern Med. 2005;165(12):1363-1369. doi:10.1001/archinte.165.12.1363
-
[12]
Bonell C, Oakley A, Hargreaves J, Strange V, Rees R. Assessment of generalisability in trials of health interventions: suggested framework and systematic review. BMJ. 2006;333(7563):346-349. doi:10.1136/bmj.333.7563.346
-
[13]
Zimmerman M, Chelminski I, Posternak MA. Exclusion criteria used in antidepressant efficacy trials: consistency across studies and representativeness of samples included. J Nerv Ment Dis. 2004;192(2):87-94. doi:10.1097/01.nmd.0000110279.23893.82
-
[14]
Sen A, Ryan PB, Goldstein A, et al. Correlating eligibility criteria generalizability and adverse events using Big Data for patients and clinical trials. Ann N Y Acad Sci. 2017;1387(1):34-43. doi:10.1111/nyas.13195
-
[15]
Weng C, Li Y, Ryan P, et al. A Distribution-based Method for Assessing The Differences between Clinical Trial Target Populations and Patient Populations in Electronic Health Records. Appl Clin Inform. 2014;5(2):463-479. doi:10.4338/ACI-2013-12-RA-0105
-
[16]
Simulation-based Evaluation of the Generalizability Index for Study Traits
He Z, Chandar P, Ryan P, Weng C. Simulation-based Evaluation of the Generalizability Index for Study Traits. AMIA Annu Symp Proc AMIA Symp. 2015;2015:594-603
work page 2015
-
[17]
Multivariate analysis of the population representativeness of related clinical studies
He Z, Ryan P, Hoxha J, et al. Multivariate analysis of the population representativeness of related clinical studies. J Biomed Inform. 2016;60:66-76. doi:10.1016/j.jbi.2016.01.007
-
[18]
Sen A, Chakrabarti S, Goldstein A, Wang S, Ryan PB, Weng C. GIST 2.0: A scalable multi -trait metric for quantifying population representativeness of individual clinical studies. J Biomed Inform. 2016;63:325-336. doi:10.1016/j.jbi.2016.09.003
-
[19]
U.S. Food & Drug Administration. FRAMEWORK FOR FDA’S REAL-WORLD EVIDENCE PROGRAM.; 2018:40. https://www.fda.gov/downloads/ScienceResearch/SpecialTopics/RealWorldEvidence/UCM627769.pdf. Accessed March 8, 2019
work page 2018
-
[20]
Real-World Evidence — What Is It and What Can It Tell Us? N Engl J Med
Sherman RE, Anderson SA, Dal Pan GJ, et al. Real-World Evidence — What Is It and What Can It Tell Us? N Engl J Med. 2016;375(23):2293-2297. doi:10.1056/NEJMsb1609216
-
[21]
FDA drug approval summary: bevacizumab plus FOLFOX4 as second-line treatment of colorectal cancer
Cohen MH, Gootenberg J, Keegan P, Pazdur R. FDA drug approval summary: bevacizumab plus FOLFOX4 as second-line treatment of colorectal cancer. The Oncologist. 2007;12(3):356-361. doi:10.1634/theoncologist.12- 3-356
-
[22]
Shenkman E, Hurt M, Hogan W, et al. OneFlorida Clinical Research Consortium: Linking a Clinical and Translational Science Institute With a Community-Based Distributive Medical Education Model. Acad Med J Assoc Am Med Coll. 2018;93(3):451-455. doi:10.1097/ACM.0000000000002029
-
[23]
OneFlorida Clinical Research Consortium
Elizabeth Shenkman, William Hogan. OneFlorida Clinical Research Consortium. https://www.pcori.org/research-results/2015/oneflorida-clinical-research-consortium. Published January 25,
work page 2015
-
[24]
Accessed March 8, 2019
work page 2019
-
[25]
OneFlorida Deduper: Tools for EHR patient de-duplication (aka entity resolution)
OneFlorida CRC. OneFlorida Deduper: Tools for EHR patient de-duplication (aka entity resolution). https://github.com/ufbmi/onefl-deduper. Published 2018. Accessed March 8, 2019
work page 2018
-
[26]
Jiang Bian, Andrei Sura, Gloria P. Lipori, et al. Implementing a Hash-based Privacy-Preserving Entity Resolution Tool in the OneFlorida Clinical Data Research Network. In: ; 2017
work page 2017
-
[27]
Regression Analysis of Count Data
Cameron AC, Trivedi PK. Regression Analysis of Count Data. Cambridge, UK ; New York, NY, USA: Cambridge University Press; 1998
work page 1998
-
[28]
Zhang H, He Z, He X, et al. Computable Eligibility Criteria through Ontology-driven Data Access: A Case Study of Hepatitis C Virus Trials. AMIA Annu Symp Proc AMIA Symp. 2018;2018:1601-1610
work page 2018
-
[29]
PCORnet. PCORnet Data. https://pcornet.org/pcornet-data/. Published May 1, 2018. Accessed March 12, 2019
work page 2018
-
[30]
Increasing Trial Generalizability
Gotay CC. Increasing Trial Generalizability. J Clin Oncol. 2006;24(6):846-847. doi:10.1200/JCO.2005.04.5120
-
[31]
Generalizability of common cancer clinical trial eligibility criteria in the real world
Karim S, Xu Y, Quan ML, Dort JC, Bouchard-Fortier A, Cheung WY. Generalizability of common cancer clinical trial eligibility criteria in the real world. J Clin Oncol. 2018;36(15_suppl):e18616-e18616. doi:10.1200/JCO.2018.36.15_suppl.e18616
-
[32]
Kim ES, Bruinooge SS, Roberts S, et al. Broadening Eligibility Criteria to Make Clinical Trials More Representative: American Society of Clinical Oncology and Friends of Cancer Research Joint Research Statement. J Clin Oncol. 2017;35(33):3737-3744. doi:10.1200/JCO.2017.73.7916
-
[33]
Inclusion/Exclusion Criteria for National Cancer Institute (NCI) Sponsored Clinical Trials
NCI. Inclusion/Exclusion Criteria for National Cancer Institute (NCI) Sponsored Clinical Trials. https://ctep.cancer.gov/protocolDevelopment/docs/NCI_ASCO_Friends_Eligibility_Criteria.pdf. Published September 26, 2018. Accessed March 12, 2019
work page 2018
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.