pith. sign in

arxiv: 1906.10163 · v1 · pith:7BK3NW5Lnew · submitted 2019-06-24 · 📊 stat.AP

Assessing the Validity of a a priori Patient-Trial Generalizability Score using Real-world Data from a Large Clinical Data Research Network: A Colorectal Cancer Clinical Trial Case Study

Pith reviewed 2026-05-25 16:38 UTC · model grok-4.3

classification 📊 stat.AP
keywords patient-trial generalizabilitycolorectal cancerBevacizumabserious adverse eventselectronic health recordszero-inflated negative binomialclinical trial generalizabilityreal-world evidence
0
0 comments X

The pith

A score measuring how closely real-world patients match colorectal cancer trial criteria predicts fewer serious adverse events.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper evaluates an a priori composite patient-trial generalizability score derived from trial eligibility criteria for Bevacizumab in colorectal cancer. Using patient-level electronic health records from the OneFlorida network, the authors fit a zero-inflated negative binomial model that links this score to the observed count of serious adverse events. The results indicate that patients with higher similarity scores experience significantly fewer events. This supplies evidence that the score can forecast outcomes when the treatment moves from trial to routine care. A sympathetic reader would therefore view the score as a practical tool for gauging how well trial findings will translate outside the original study population.

Core claim

The study constructs a composite patient-trial generalizability (cPTG) score from the eligibility criteria of Bevacizumab colorectal cancer trials and applies it to real-world patients in the OneFlorida EHR consortium. Regression modeling shows that higher cPTG scores are associated with lower counts of serious adverse events, supporting the claim that patients more similar to the trial population encounter fewer safety issues when the treatment is used in practice.

What carries the argument

The composite patient-trial generalizability (cPTG) score, which aggregates similarity between an individual patient's characteristics and the eligibility criteria of the source trials.

If this is right

  • Higher cPTG scores correspond to lower numbers of serious adverse events in observational data.
  • Patients whose profiles align more closely with trial eligibility criteria face a reduced risk of serious adverse events.
  • The cPTG score functions as a valid a priori indicator of how well trial results will generalize.
  • Zero-inflated negative binomial regression can handle the excess zeros typical in adverse-event counts when testing such scores.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Routine calculation of the score from existing EHR fields could help clinicians anticipate which patients are more likely to experience complications.
  • The same score construction might be tested on other oncology drugs whose trial populations differ from everyday practice.
  • If the pattern replicates across diseases, eligibility criteria themselves could serve as built-in signals for expected real-world safety.

Load-bearing premise

The cPTG score captures every important way a patient can differ from the trial population, and the EHR records contain no unmeasured factors that affect both the score and the adverse-event count.

What would settle it

An independent cohort of colorectal cancer patients receiving Bevacizumab in which cPTG scores show no association with the number of serious adverse events.

read the original abstract

Existing trials had not taken enough consideration of their population representativeness, which can lower the effectiveness when the treatment is applied in real-world clinical practice. We analyzed the eligibility criteria of Bevacizumab colorectal cancer treatment trials, assessed their a priori generalizability, and examined how it affects patient outcomes when applied in real-world clinical settings. To do so, we extracted patient-level data from a large collection of electronic health records (EHRs) from the OneFlorida consortium. We built a zero-inflated negative binomial model using a composite patient-trial generalizability (cPTG) score to predict patients clinical outcomes (i.e., number of serious adverse events, (SAEs)). Our study results provide a body of evidence that 1) the cPTG scores can predict patient outcomes; and 2) patients who are more similar to the study population in the trials that were used to develop the treatment will have a significantly lower possibility to experience serious adverse events.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The paper extracts eligibility criteria from Bevacizumab colorectal cancer trials, constructs a composite patient-trial generalizability (cPTG) score, and fits a zero-inflated negative binomial regression on OneFlorida EHR data to test whether higher cPTG scores predict lower counts of serious adverse events (SAEs). The central claim is that the cPTG score is predictive and that patients more similar to the trial population experience significantly fewer SAEs in real-world data.

Significance. If the association survives adjustment for health-status confounders and is shown to be robust, the result would supply empirical support for using a priori generalizability scores to anticipate real-world performance of trial-derived treatments. The work sits at the intersection of trial design and observational pharmacoepidemiology; however, the current manuscript supplies neither effect sizes nor sensitivity analyses, so its immediate contribution remains limited.

major comments (3)
  1. [Abstract] Abstract (modeling approach): The zero-inflated negative binomial regression is presented with cPTG as the key predictor of SAE count, yet the description gives no indication that the model includes a rich set of baseline covariates (performance status, unrecorded comorbidities, socioeconomic factors) that could confound both eligibility match and SAE incidence. Without such adjustment or a negative-control/sensitivity analysis, the reported association is vulnerable to the unmeasured-confounding concern raised in the skeptic note.
  2. [Abstract] Abstract: No regression coefficients, incidence-rate ratios, confidence intervals, or model-fit statistics for the cPTG term are supplied, so it is impossible to judge the magnitude, precision, or practical importance of the claimed predictive relationship.
  3. [Abstract] Abstract: The manuscript asserts that the cPTG score is constructed a priori and that the regression is predictive rather than definitional, but provides no explicit statement that score construction and model selection were performed without reference to the same OneFlorida data later used for validation. This leaves open the possibility of circularity noted in the reader's assessment.
minor comments (1)
  1. [Abstract] Abstract: The phrase 'a a priori' contains a typographical error and should read 'an a priori'.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address each of the three major comments below and outline the revisions we will make to strengthen the paper.

read point-by-point responses
  1. Referee: [Abstract] Abstract (modeling approach): The zero-inflated negative binomial regression is presented with cPTG as the key predictor of SAE count, yet the description gives no indication that the model includes a rich set of baseline covariates (performance status, unrecorded comorbidities, socioeconomic factors) that could confound both eligibility match and SAE incidence. Without such adjustment or a negative-control/sensitivity analysis, the reported association is vulnerable to the unmeasured-confounding concern raised in the skeptic note.

    Authors: We agree that adjustment for additional baseline covariates would help address potential confounding. The current model focuses on the cPTG score as the primary predictor, but the OneFlorida EHR data contain several relevant variables (demographics, recorded comorbidities). In the revised manuscript we will refit the zero-inflated negative binomial model both with and without these covariates, report the corresponding incidence-rate ratios, and add sensitivity analyses to evaluate robustness to unmeasured confounding. revision: yes

  2. Referee: [Abstract] Abstract: No regression coefficients, incidence-rate ratios, confidence intervals, or model-fit statistics for the cPTG term are supplied, so it is impossible to judge the magnitude, precision, or practical importance of the claimed predictive relationship.

    Authors: The full regression output, including coefficients, incidence-rate ratios, 95% confidence intervals, and model-fit statistics for the cPTG term, appears in the Results section of the manuscript. To make these quantitative findings immediately accessible, we will add the key effect-size estimates and confidence intervals to the abstract in the revision. revision: yes

  3. Referee: [Abstract] Abstract: The manuscript asserts that the cPTG score is constructed a priori and that the regression is predictive rather than definitional, but provides no explicit statement that score construction and model selection were performed without reference to the same OneFlorida data later used for validation. This leaves open the possibility of circularity noted in the reader's assessment.

    Authors: The cPTG score is derived solely from the eligibility criteria published in the trial protocols; the OneFlorida data are used exclusively for the subsequent validation step. We will insert an explicit statement in the Methods section clarifying this temporal and data-source separation to remove any ambiguity about circularity. revision: yes

Circularity Check

0 steps flagged

No significant circularity; cPTG constructed a priori from eligibility criteria and tested on independent EHR outcomes

full rationale

The paper defines the composite patient-trial generalizability (cPTG) score from trial eligibility criteria before any outcome modeling, then applies the fixed score to OneFlorida EHR patients and fits a zero-inflated negative binomial regression of SAE count on that score. No equation or step reduces the reported association to a fit on the same data, a self-citation chain, or a definitional tautology; the regression coefficient is an empirical test on held-out real-world data and remains falsifiable. Self-citation of prior cPTG work, if present, supplies only the score definition and does not bear the load of the outcome association. This is a standard external-validation design with no circular reduction.

Axiom & Free-Parameter Ledger

1 free parameters · 2 axioms · 1 invented entities

The central claim rests on the validity of the a priori cPTG score as a similarity measure and on the appropriateness of the zero-inflated negative binomial model for count data in observational EHR records.

free parameters (1)
  • ZINB regression coefficients and dispersion parameters
    Fitted to relate the cPTG score to the count of serious adverse events.
axioms (2)
  • domain assumption The composite patient-trial generalizability score validly quantifies similarity to trial eligibility criteria
    Used as the primary predictor without further validation in the abstract.
  • domain assumption The zero-inflated negative binomial distribution is an appropriate model for the distribution of serious adverse event counts
    Chosen as the statistical framework for the outcome variable.
invented entities (1)
  • cPTG score no independent evidence
    purpose: Composite a priori measure of patient similarity to trial population
    Constructed from eligibility criteria; no independent evidence of validity supplied in the abstract.

pith-pipeline@v0.9.0 · 5731 in / 1459 out tokens · 36455 ms · 2026-05-25T16:38:51.529970+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

33 extracted references · 33 canonical work pages

  1. [1]

    https://medlineplus.gov/magazine/issues/summer11/articles/summer11pg2-3.html

    From the NIH Director: The Importance of Clinical Trials | NIH MedlinePlus the Magazine. https://medlineplus.gov/magazine/issues/summer11/articles/summer11pg2-3.html. Accessed March 4, 2019

  2. [2]

    to whom do the results of this trial apply?

    Rothwell PM. External validity of randomised controlled trials: “to whom do the results of this trial apply?” Lancet Lond Engl. 2005;365(9453):82-93. doi:10.1016/S0140-6736(04)17670-8

  3. [3]

    External and internal validity in clinical trials

    Sedgwick P. External and internal validity in clinical trials. BMJ. 2012;344(feb16 1):e1004-e1004. doi:10.1136/bmj.e1004

  4. [4]

    The age gap between patients in clinical studies and in the general population: a pitfall for dementia research

    Schoenmaker N, Van Gool WA. The age gap between patients in clinical studies and in the general population: a pitfall for dementia research. Lancet Neurol. 2004;3(10):627-630. doi:10.1016/S1474-4422(04)00884-1

  5. [5]

    Underrepresentation of women, elderly patients, and racial minorities in the randomized trials used for cardiovascular guidelines

    Sardar MR, Badri M, Prince CT, Seltzer J, Kowey PR. Underrepresentation of women, elderly patients, and racial minorities in the randomized trials used for cardiovascular guidelines. JAMA Intern Med. 2014;174(11):1868-1870. doi:10.1001/jamainternmed.2014.4758

  6. [6]

    Battisti NML, Sehovic M, Extermann M. Assessment of the External Validity of the National Comprehensive Cancer Network and European Society for Medical Oncology Guidelines for Non -Small-Cell Lung Cancer in a Population of Patients Aged 80 Years and Older. Clin Lung Cancer. 2017;18(5):460-471. doi:10.1016/j.cllc.2017.03.005

  7. [7]

    Participation of older people in preauthorization trials of recently approved medicines

    Beers E, Moerkerken DC, Leufkens HGM, Egberts TCG, Jansen PAF. Participation of older people in preauthorization trials of recently approved medicines. J Am Geriatr Soc. 2014;62(10):1883-1890. doi:10.1111/jgs.13067

  8. [8]

    Barriers to inclusion of older adults in randomised controlled clinical trials on Non-Hodgkin’s lymphoma: a systematic review

    Bellera C, Praud D, Petit-Monéger A, McKelvie-Sebileau P, Soubeyran P, Mathoulin-Pélissier S. Barriers to inclusion of older adults in randomised controlled clinical trials on Non-Hodgkin’s lymphoma: a systematic review. Cancer Treat Rev. 2013;39(7):812-817. doi:10.1016/j.ctrv.2013.01.007

  9. [9]

    The older the better: are elderly study participants more non-representative? A cross-sectional analysis of clinical trial and observational study samples

    Golomb BA, Chan VT, Evans MA, Koperski S, White HL, Criqui MH. The older the better: are elderly study participants more non-representative? A cross-sectional analysis of clinical trial and observational study samples. BMJ Open. 2012;2(6). doi:10.1136/bmjopen-2012-000833

  10. [10]

    Participation of patients 65 years of age or older in cancer clinical trials

    Lewis JH, Kilgore ML, Goldman DP, et al. Participation of patients 65 years of age or older in cancer clinical trials. J Clin Oncol Off J Am Soc Clin Oncol. 2003;21(7):1383-1389. doi:10.1200/JCO.2003.08.010

  11. [11]

    Adverse drug event surveillance and drug withdrawals in the United States, 1969 - 2002: the importance of reporting suspected reactions

    Wysowski DK, Swartz L. Adverse drug event surveillance and drug withdrawals in the United States, 1969 - 2002: the importance of reporting suspected reactions. Arch Intern Med. 2005;165(12):1363-1369. doi:10.1001/archinte.165.12.1363

  12. [12]

    Assessment of generalisability in trials of health interventions: suggested framework and systematic review

    Bonell C, Oakley A, Hargreaves J, Strange V, Rees R. Assessment of generalisability in trials of health interventions: suggested framework and systematic review. BMJ. 2006;333(7563):346-349. doi:10.1136/bmj.333.7563.346

  13. [13]

    Exclusion criteria used in antidepressant efficacy trials: consistency across studies and representativeness of samples included

    Zimmerman M, Chelminski I, Posternak MA. Exclusion criteria used in antidepressant efficacy trials: consistency across studies and representativeness of samples included. J Nerv Ment Dis. 2004;192(2):87-94. doi:10.1097/01.nmd.0000110279.23893.82

  14. [14]

    Correlating eligibility criteria generalizability and adverse events using Big Data for patients and clinical trials

    Sen A, Ryan PB, Goldstein A, et al. Correlating eligibility criteria generalizability and adverse events using Big Data for patients and clinical trials. Ann N Y Acad Sci. 2017;1387(1):34-43. doi:10.1111/nyas.13195

  15. [15]

    A Distribution-based Method for Assessing The Differences between Clinical Trial Target Populations and Patient Populations in Electronic Health Records

    Weng C, Li Y, Ryan P, et al. A Distribution-based Method for Assessing The Differences between Clinical Trial Target Populations and Patient Populations in Electronic Health Records. Appl Clin Inform. 2014;5(2):463-479. doi:10.4338/ACI-2013-12-RA-0105

  16. [16]

    Simulation-based Evaluation of the Generalizability Index for Study Traits

    He Z, Chandar P, Ryan P, Weng C. Simulation-based Evaluation of the Generalizability Index for Study Traits. AMIA Annu Symp Proc AMIA Symp. 2015;2015:594-603

  17. [17]

    Multivariate analysis of the population representativeness of related clinical studies

    He Z, Ryan P, Hoxha J, et al. Multivariate analysis of the population representativeness of related clinical studies. J Biomed Inform. 2016;60:66-76. doi:10.1016/j.jbi.2016.01.007

  18. [18]

    GIST 2.0: A scalable multi -trait metric for quantifying population representativeness of individual clinical studies

    Sen A, Chakrabarti S, Goldstein A, Wang S, Ryan PB, Weng C. GIST 2.0: A scalable multi -trait metric for quantifying population representativeness of individual clinical studies. J Biomed Inform. 2016;63:325-336. doi:10.1016/j.jbi.2016.09.003

  19. [19]

    Food & Drug Administration

    U.S. Food & Drug Administration. FRAMEWORK FOR FDA’S REAL-WORLD EVIDENCE PROGRAM.; 2018:40. https://www.fda.gov/downloads/ScienceResearch/SpecialTopics/RealWorldEvidence/UCM627769.pdf. Accessed March 8, 2019

  20. [20]

    Real-World Evidence — What Is It and What Can It Tell Us? N Engl J Med

    Sherman RE, Anderson SA, Dal Pan GJ, et al. Real-World Evidence — What Is It and What Can It Tell Us? N Engl J Med. 2016;375(23):2293-2297. doi:10.1056/NEJMsb1609216

  21. [21]

    FDA drug approval summary: bevacizumab plus FOLFOX4 as second-line treatment of colorectal cancer

    Cohen MH, Gootenberg J, Keegan P, Pazdur R. FDA drug approval summary: bevacizumab plus FOLFOX4 as second-line treatment of colorectal cancer. The Oncologist. 2007;12(3):356-361. doi:10.1634/theoncologist.12- 3-356

  22. [22]

    OneFlorida Clinical Research Consortium: Linking a Clinical and Translational Science Institute With a Community-Based Distributive Medical Education Model

    Shenkman E, Hurt M, Hogan W, et al. OneFlorida Clinical Research Consortium: Linking a Clinical and Translational Science Institute With a Community-Based Distributive Medical Education Model. Acad Med J Assoc Am Med Coll. 2018;93(3):451-455. doi:10.1097/ACM.0000000000002029

  23. [23]

    OneFlorida Clinical Research Consortium

    Elizabeth Shenkman, William Hogan. OneFlorida Clinical Research Consortium. https://www.pcori.org/research-results/2015/oneflorida-clinical-research-consortium. Published January 25,

  24. [24]

    Accessed March 8, 2019

  25. [25]

    OneFlorida Deduper: Tools for EHR patient de-duplication (aka entity resolution)

    OneFlorida CRC. OneFlorida Deduper: Tools for EHR patient de-duplication (aka entity resolution). https://github.com/ufbmi/onefl-deduper. Published 2018. Accessed March 8, 2019

  26. [26]

    Lipori, et al

    Jiang Bian, Andrei Sura, Gloria P. Lipori, et al. Implementing a Hash-based Privacy-Preserving Entity Resolution Tool in the OneFlorida Clinical Data Research Network. In: ; 2017

  27. [27]

    Regression Analysis of Count Data

    Cameron AC, Trivedi PK. Regression Analysis of Count Data. Cambridge, UK ; New York, NY, USA: Cambridge University Press; 1998

  28. [28]

    Computable Eligibility Criteria through Ontology-driven Data Access: A Case Study of Hepatitis C Virus Trials

    Zhang H, He Z, He X, et al. Computable Eligibility Criteria through Ontology-driven Data Access: A Case Study of Hepatitis C Virus Trials. AMIA Annu Symp Proc AMIA Symp. 2018;2018:1601-1610

  29. [29]

    PCORnet Data

    PCORnet. PCORnet Data. https://pcornet.org/pcornet-data/. Published May 1, 2018. Accessed March 12, 2019

  30. [30]

    Increasing Trial Generalizability

    Gotay CC. Increasing Trial Generalizability. J Clin Oncol. 2006;24(6):846-847. doi:10.1200/JCO.2005.04.5120

  31. [31]

    Generalizability of common cancer clinical trial eligibility criteria in the real world

    Karim S, Xu Y, Quan ML, Dort JC, Bouchard-Fortier A, Cheung WY. Generalizability of common cancer clinical trial eligibility criteria in the real world. J Clin Oncol. 2018;36(15_suppl):e18616-e18616. doi:10.1200/JCO.2018.36.15_suppl.e18616

  32. [32]

    Broadening Eligibility Criteria to Make Clinical Trials More Representative: American Society of Clinical Oncology and Friends of Cancer Research Joint Research Statement

    Kim ES, Bruinooge SS, Roberts S, et al. Broadening Eligibility Criteria to Make Clinical Trials More Representative: American Society of Clinical Oncology and Friends of Cancer Research Joint Research Statement. J Clin Oncol. 2017;35(33):3737-3744. doi:10.1200/JCO.2017.73.7916

  33. [33]

    Inclusion/Exclusion Criteria for National Cancer Institute (NCI) Sponsored Clinical Trials

    NCI. Inclusion/Exclusion Criteria for National Cancer Institute (NCI) Sponsored Clinical Trials. https://ctep.cancer.gov/protocolDevelopment/docs/NCI_ASCO_Friends_Eligibility_Criteria.pdf. Published September 26, 2018. Accessed March 12, 2019