Why decision curves go above or below treat-all and treat-none: a PPV- and calibration-based guide for clinical prediction models

Linard Hoessly

arxiv: 2603.26184 · v2 · pith:LVLF2ZSOnew · submitted 2026-03-27 · 📊 stat.AP

Pith reviewed 2026-05-21 09:39 UTC · model grok-4.3

classification 📊 stat.AP

keywords: decision curves, net benefit, positive predictive value, model calibration, clinical prediction models, treat-all, treat-none, risk threshold

The pith

Net benefit comparisons to treat-all and treat-none reduce to threshold-specific observed risks, connecting decision curves to subgroup calibration and positive predictive value.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops two practical interpretations of net benefit to help clinicians understand decision curves. It shows that a curve sits above the treat-none line exactly when the observed risk among patients at or above the chosen threshold exceeds the threshold, and above the treat-all line exactly when the observed risk among patients below the threshold falls short of it. This ties curve performance directly to how well the model is calibrated inside the treated and untreated subgroups. The author also rewrites net benefit in terms of positive predictive value, which clarifies when acting on a prediction improves decisions over simpler strategies, and concludes by recommending positive predictive value curves as a direct companion plot to standard decision curves.

Core claim

Comparisons with treat-none and treat-all can be expressed through threshold-specific observed risk in patients above and below the decision threshold, linking decision-curve performance to calibration in clinically relevant subgroups. Net benefit also relates to positive predictive value, offering a more intuitive explanation of when acting on model predictions is justified. The derivations are illustrated and positive predictive value curves are proposed as a practical complement to decision curves.
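The treat-all comparison can be written out in the same style as the treat-none identity quoted on this page. The following is a sketch under standard definitions, not a quotation from the paper: the symbols $I$ (incidence) and $s_t$ follow the page's excerpts, while $\bar{Y}_{<t}$ (observed risk below the threshold) is notation assumed here for illustration.

```latex
% Assumed notation: incidence I, threshold t,
% s_t = fraction of patients with predicted risk >= t,
% \bar{Y}_{\geq t}, \bar{Y}_{<t} = observed event rates at/above and below t.
\begin{align*}
\mathrm{NB}_{\text{all}}(t) &= I - \frac{t}{1-t}(1-I) = \frac{I-t}{1-t},
\qquad I = s_t\,\bar{Y}_{\geq t} + (1-s_t)\,\bar{Y}_{<t},\\
\mathrm{NB}(t) &= \frac{s_t}{1-t}\bigl(\bar{Y}_{\geq t}-t\bigr),\\
\mathrm{NB}(t)-\mathrm{NB}_{\text{all}}(t)
 &= \frac{s_t\bar{Y}_{\geq t} - s_t t - I + t}{1-t}
  = \frac{1-s_t}{1-t}\bigl(t-\bar{Y}_{<t}\bigr).
\end{align*}
```

Read this way, the model beats treat-all at threshold $t$ exactly when some patients fall below the threshold and their observed risk is lower than $t$, which is the sense in which the comparison reduces to calibration in the untreated subgroup.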

What carries the argument

Threshold-specific observed risk above and below the decision threshold, together with its algebraic link to positive predictive value.

If this is right

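A model that is well calibrated among patients above the threshold has an observed risk in that group at least as high as the threshold, so its decision curve lies on or above the treat-none line.
Net benefit is positive exactly when positive predictive value at the threshold exceeds the threshold t itself; equivalently, when the odds PPV/(1 − PPV) exceed the harm-to-benefit ratio t/(1 − t) implied by the threshold.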
Positive predictive value curves supply an alternative visual check on the same information that decision curves display.
Poor calibration in the high-risk subgroup directly lowers or eliminates the apparent advantage of the model over treat-none.
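
The points above can be explored numerically. The sketch below computes a PPV curve next to net benefit on synthetic data and prints, per threshold, whether PPV exceeds the threshold. The data-generating choices, the function names `net_benefit` and `ppv`, and the "treat when predicted risk ≥ t" convention are assumptions made for illustration; this is not the paper's code and not its GUSTO-I or SUPPORT analysis.

```python
import numpy as np

def net_benefit(y, p, t):
    """Net benefit at threshold t, treating patients with predicted risk >= t."""
    y = np.asarray(y)
    treat = np.asarray(p) >= t
    n = len(y)
    tp = np.sum(treat & (y == 1))
    fp = np.sum(treat & (y == 0))
    return tp / n - t / (1 - t) * fp / n

def ppv(y, p, t):
    """Observed event rate among treated patients; NaN if nobody is treated."""
    treat = np.asarray(p) >= t
    return np.asarray(y)[treat].mean() if treat.any() else float("nan")

# Synthetic data (assumption): predicted risks overestimate the true risk by design.
rng = np.random.default_rng(0)
p = rng.uniform(0.01, 0.99, size=2000)   # predicted risks
y = rng.binomial(1, 0.7 * p)             # outcomes drawn from lower true risks

thresholds = np.arange(0.05, 0.95, 0.05)
rows = [(t, ppv(y, p, t), net_benefit(y, p, t)) for t in thresholds]
for t, v, nb in rows:
    print(f"t={t:.2f}  PPV={v:.3f}  NB={nb:+.4f}  PPV above threshold: {v > t}")
```

Plotting the `PPV` column against `t` together with the diagonal PPV = t gives one possible form of the companion plot: the model beats treat-none wherever the PPV curve lies above the diagonal.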

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

This framing could guide targeted recalibration efforts focused only on the risk range where decisions are actually made.
Clinicians might choose thresholds by inspecting positive predictive value rather than net benefit alone.
The same observed-risk decomposition might apply to other threshold-based decision metrics beyond net benefit.

Load-bearing premise

That threshold-specific observed risks and positive predictive values directly represent clinical utility without further conditions on data quality or population traits.

What would settle it

A dataset in which the net benefit value computed at a threshold fails to match the value obtained from the observed risk among patients whose predicted risk exceeds that threshold.
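
The check described above is easy to run on any dataset. The sketch below does so on synthetic, deliberately miscalibrated data; the function names and the simulation settings are hypothetical, and the "predicted risk ≥ t" convention is assumed. Because the relation is an algebraic identity, the two columns are expected to agree up to floating-point error for any data.

```python
import numpy as np

def nb_from_counts(y, p, t):
    """Standard net benefit: TP/n - t/(1-t) * FP/n, treating when p >= t."""
    treat = p >= t
    n = len(y)
    tp = np.sum(treat & (y == 1))
    fp = np.sum(treat & (y == 0))
    return tp / n - t / (1 - t) * fp / n

def nb_from_observed_risk(y, p, t):
    """Same quantity via s_t / (1 - t) * (observed risk at or above t - t)."""
    treat = p >= t
    s_t = treat.mean()                 # fraction of patients at or above the threshold
    if s_t == 0:
        return 0.0                     # nobody treated: net benefit is zero
    y_bar_above = y[treat].mean()      # observed risk among treated patients
    return s_t / (1 - t) * (y_bar_above - t)

# Synthetic data (assumption): true risk p**1.5 differs from the predicted risk p.
rng = np.random.default_rng(1)
p = rng.beta(2, 5, size=5000)
y = rng.binomial(1, p ** 1.5)

for t in (0.1, 0.2, 0.3, 0.5):
    a = nb_from_counts(y, p, t)
    b = nb_from_observed_risk(y, p, t)
    print(f"t={t:.1f}  from counts={a:+.6f}  from observed risk={b:+.6f}")
```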

Figures

Figures reproduced from arXiv: 2603.26184 by Linard Hoessly.

**Figure 4.** A comparatively rich and a very simple logistic regression model.

**Figure 1.** Net benefit curve with corresponding PPV curve for GUSTO-I.

**Figure 2.** Net benefit curve with corresponding PPV curve for SUPPORT.

**Figure 3.** (no caption extracted)

**Figure 4.** Net benefit curve with corresponding PPV curve for SUPPORT.

Excerpt captured with Figures 1 and 2 (Section 2.4, Analytical arguments on calibration and net benefit): We review the observations that miscalibration can substantially reduce clinical utility and may even lead to clinical harm, i.e. net benefit below the treat-all or treat-none strategies [4]. Two failure modes were highlighted: systematic overestimation can yield $\mathrm{NB}(t) < 0$ for thresholds $t > I$ (worse than treat-none), whereas systematic underestimation can yield $\mathrm{NB}(t) < \mathrm{NB}_{\text{all}}(t)$ for thresholds $t < I$ (worse than treat-all). Both effects can be explained by the observations in Section 2.2. For convenience we briefly go through the arguments below. Overestimation. If risks are systematically overestimated, s…

Excerpt captured with Figure 4 (Appendix D, Mathematical derivations for calibration; D.1, Better than treat-none): Let $s_t > 0$; then $\bar{Y}_{\geq t} := \mathrm{TP}(t)/(\mathrm{TP}(t) + \mathrm{FP}(t))$, $\mathrm{TP}(t)/n = s_t\,\bar{Y}_{\geq t}$, and $\mathrm{FP}(t)/n = s_t\,(1 - \bar{Y}_{\geq t})$. Substituting into $\mathrm{NB}(t) = \frac{\mathrm{TP}(t)}{n} - \frac{t}{1-t}\,\frac{\mathrm{FP}(t)}{n}$ yields $\mathrm{NB}(t) = s_t\,\bar{Y}_{\geq t} - \frac{t}{1-t}\,s_t\,(1 - \bar{Y}_{\geq t}) = \frac{s_t}{1-t}$ …
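
The two failure modes named in the Section 2.4 excerpt can be reproduced with a deliberately extreme toy example. The sketch below is an illustration under stated assumptions, not the paper's analysis: every patient receives the same predicted risk, the cohorts are hypothetical, and the function names are invented here.

```python
import numpy as np

def net_benefit(y, p, t):
    """Net benefit at threshold t, treating patients with predicted risk >= t."""
    treat = p >= t
    n = len(y)
    tp = np.sum(treat & (y == 1))
    fp = np.sum(treat & (y == 0))
    return tp / n - t / (1 - t) * fp / n

def nb_treat_all(y, t):
    """Net benefit of treating everyone: I - t/(1-t) * (1 - I)."""
    incidence = y.mean()
    return incidence - t / (1 - t) * (1 - incidence)

# Overestimation: incidence I = 0.10, but every patient is predicted at 0.30.
y_low = np.array([1] * 100 + [0] * 900)
p_over = np.full(1000, 0.30)
nb_over = net_benefit(y_low, p_over, 0.20)    # threshold above I; about -0.125

# Underestimation: incidence I = 0.30, but every patient is predicted at 0.10.
y_high = np.array([1] * 300 + [0] * 700)
p_under = np.full(1000, 0.10)
nb_under = net_benefit(y_high, p_under, 0.20)  # threshold below I; nobody treated, so 0
nb_all = nb_treat_all(y_high, 0.20)            # about 0.125

print(f"overestimation:  NB = {nb_over:.3f} (treat-none gives 0)")
print(f"underestimation: NB = {nb_under:.3f}, treat-all = {nb_all:.3f}")
```

In the first cohort the model treats everyone although the observed risk among the treated (0.10) is below the threshold, so it is worse than treat-none. In the second it treats no one although the observed risk among the untreated (0.30) is above the threshold, so it is worse than treat-all.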

Original abstract

Net benefit is widely used and reported to evaluate the clinical utility of prediction models, yet its interpretation often remains difficult in practice. In this didactical note, we develop two complementary interpretations that make net benefit easier to understand for clinical audiences. We show that comparisons with treat-none and treat-all can be expressed through threshold-specific observed risk in patients above and below the decision threshold, linking decision-curve performance to calibration in clinically relevant subgroups. We also show how net benefit relates to positive predictive value, offering a more intuitive explanation of when acting on model predictions is justified. We derive and illustrate these results and propose positive predictive value curves as a practical complement to decision curves.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This note re-expresses decision-curve net benefit via observed risks above/below threshold and via PPV, but the relations are definitional identities from the standard formula.

The two things to know about this paper: it shows that decision-curve comparisons with treat-all and treat-none reduce to threshold-specific observed risks in the relevant patient groups, which links them to calibration; and it relates net benefit to PPV for a more intuitive view. It presents these derivations clearly for clinical readers and suggests PPV curves as a practical tool alongside decision curves. The illustrations should help if they are as straightforward as the abstract indicates. The softer point is that these are algebraic identities that follow from the usual net benefit formula. They hold by definition for any data, so the paper's strength is explanatory rather than methodological or empirical. There are no major gaps in the logic, but the scope is narrow. The note is aimed at people who use clinical prediction models and find standard decision-curve output hard to interpret. A reader looking for practical guidance will find value, while one seeking original research might pass. It deserves a serious referee because better explanations can make existing tools more usable. Recommendation: yes, send it for peer review to see whether it improves understanding in the field.

Referee Report

0 major / 3 minor

Summary. The paper is a didactical note deriving two complementary interpretations of net benefit in decision-curve analysis for clinical prediction models. It shows that net-benefit comparisons against treat-all and treat-none can be algebraically re-expressed using the empirical event rate (observed risk) among patients whose predicted probability exceeds the decision threshold (and symmetrically below it), thereby linking decision-curve performance to calibration within clinically relevant subgroups. It further relates net benefit at a given threshold to the positive predictive value of the model at that threshold. The derivations are illustrated, and the author proposes PPV curves as a practical complement to decision curves.

Significance. If the algebraic identities hold, the manuscript offers a useful pedagogical contribution by grounding the interpretation of decision curves in observable quantities (subgroup calibration and PPV) rather than the abstract net-benefit formula alone. The derivations are parameter-free and follow directly from substitution into the standard net-benefit expression, so they hold for any fixed threshold and any joint distribution of predictions and outcomes. This strengthens the practical teaching and application of decision-curve analysis without introducing new empirical claims or assumptions about data quality.

minor comments (3)

The manuscript would benefit from an explicit statement early in the derivations section confirming that the re-expressions are identities that hold by definition once the standard net-benefit formula is substituted, to avoid any impression of additional modeling assumptions.
Figure legends and axis labels for the proposed PPV curves should more clearly distinguish them from the conventional decision curves and indicate whether the PPV axis is plotted on the same probability scale.
A brief note on the handling of ties or continuous versus discrete thresholds would clarify the practical computation of the threshold-specific observed risks.
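
The third comment can be made concrete with a small example. The sketch below uses a hypothetical three-valued risk score, so that many patients tie exactly at the threshold, and compares the conventions "treat when p ≥ t" and "treat when p > t". The data, the function name, and the `inclusive` flag are assumptions for illustration; the page's notation $\bar{Y}_{\geq t}$ suggests the inclusive convention, but the paper's handling of ties is not shown here.

```python
import numpy as np

def observed_risk_above(y, p, t, inclusive=True):
    """Return (fraction treated, observed risk among treated) for p >= t or p > t."""
    treat = (p >= t) if inclusive else (p > t)
    s_t = treat.mean()
    y_bar = y[treat].mean() if treat.any() else float("nan")
    return s_t, y_bar

# Hypothetical score with three values; 30 of 100 patients tie at t = 0.2.
p = np.array([0.1] * 50 + [0.2] * 30 + [0.4] * 20)
y = np.array([1] * 5 + [0] * 45      # 5 events among 50 patients at p = 0.1
             + [1] * 3 + [0] * 27    # 3 events among 30 patients at p = 0.2
             + [1] * 8 + [0] * 12)   # 8 events among 20 patients at p = 0.4

t = 0.2
for inclusive in (True, False):
    s_t, y_bar = observed_risk_above(y, p, t, inclusive)
    nb = s_t / (1 - t) * (y_bar - t)   # net benefit via the observed-risk form
    rule = "p >= t" if inclusive else "p >  t"
    print(f"{rule}: treated fraction={s_t:.2f}  observed risk={y_bar:.2f}  NB={nb:.4f}")
```

With continuous predictions the two conventions rarely differ, but for scores with few distinct values the tied group can shift both the threshold-specific observed risk and the net benefit, so stating the convention alongside any reported curve seems worthwhile.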

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive and constructive assessment of our didactical note. We appreciate the recognition that the algebraic links between net benefit, subgroup calibration, and positive predictive value offer a useful pedagogical contribution to decision-curve analysis. No specific major comments were raised in the report, so we have no points to address individually. We will incorporate any minor editorial suggestions during revision.

Circularity Check

0 steps flagged

No significant circularity identified

The paper's central derivations consist of algebraic re-expressions that substitute the standard net-benefit formula into expressions involving threshold-specific observed risks and positive predictive value. These identities hold by definition for any fixed threshold and any joint distribution of predictions and outcomes, without fitted parameters, self-referential equations, or load-bearing self-citations. The derivations are self-contained, drawing only on prior standard definitions of net benefit, PPV, and calibration rather than reducing to the paper's own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Based solely on the abstract, the paper relies on standard statistical definitions of net benefit, PPV, and calibration without introducing new free parameters, axioms beyond basic probability, or invented entities.

axioms (1)

Domain assumption: Standard definitions of positive predictive value and calibration as functions of predicted and observed risks hold in the relevant patient subgroups.
Invoked implicitly when linking decision curve position to threshold-specific observed risk and PPV.

pith-pipeline@v0.9.0 · 5640 in / 1292 out tokens · 45724 ms · 2026-05-21T09:39:01.858388+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · tag: unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: $\mathrm{NB}(t) = \frac{s_t}{1-t}\,(\bar{Y}_{\geq t} - t)$ and $\mathrm{NB}(t) > 0 \iff \mathrm{PPV}(t) > t$
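
A small worked example may help in reading the quoted identity. The numbers below are hypothetical and chosen only so the arithmetic is easy to follow; they do not come from the paper.

```latex
% Hypothetical cohort: n = 1000, threshold t = 0.2,
% 300 patients with predicted risk >= t, of whom 90 have the event.
\begin{align*}
s_t &= 300/1000 = 0.3, \qquad \bar{Y}_{\geq t} = \mathrm{PPV}(t) = 90/300 = 0.3,\\
\mathrm{NB}(t) &= \frac{\mathrm{TP}(t)}{n} - \frac{t}{1-t}\,\frac{\mathrm{FP}(t)}{n}
 = 0.09 - 0.25 \times 0.21 = 0.0375,\\
\frac{s_t}{1-t}\bigl(\bar{Y}_{\geq t} - t\bigr) &= \frac{0.3}{0.8} \times 0.1 = 0.0375 .
\end{align*}
```

Both routes give the same value, and since $\mathrm{PPV}(t) = 0.3 > t = 0.2$ the net benefit is positive, as the equivalence states.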

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

40 extracted references · 40 canonical work pages

[1] Anthony K Akobeng. Understanding diagnostic tests 1: sensitivity, specificity and predictive values. Acta Paediatrica, 96(3):338–341, February 2007.
[2] Douglas G Altman and J Martin Bland. Statistics notes: Diagnostic tests 2: predictive values. BMJ, 309(6947):102, 1994.
[3] Ben Van Calster, Gary S. Collins, Andrew J. Vickers, Laure Wynants, Kathleen F. Kerr, Lasai Barrenada, Gael Varoquaux, Karandeep Singh, Karel G. M. Moons, Tina Hernandez-Boussard, Dirk Timmerman, David J. Mclernon, Maarten Van Smeden, and Ewout W. Steyerberg. Performance evaluation of predictive ai models to support medical decisions: Overview and guid...
[4] Ben Van Calster and Andrew J. Vickers. Calibration of risk prediction models: Impact on decision-analytic performance. Medical Decision Making, 35(2):162–169, 2015. PMID: 25155798.
[5] G. S. Collins and D. G. Altman. Predicting the 10 year risk of cardiovascular disease in the united kingdom: independent and external validation of an updated version of qrisk2. BMJ, 344(jun21 1):e4181–e4181, June 2012.
[6] Gary S Collins, Karel G M Moons, Paula Dhiman, Richard D Riley, Andrew L Beam, Ben Van Calster, Marzyeh Ghassemi, Xiaoxuan Liu, Johannes B Reitsma, Maarten van Smeden, Anne-Laure Boulesteix, Jennifer Catherine Camaradou, Leo Anthony Celi, Spiros Denaxas, Alastair K Denniston, Ben Glocker, Robert M Golub, Hugh Harvey, Georg Heinze, Michael M Hoffman, Andre... Tripod+ai statement: updated guidance for reporting clinical prediction models that use regression or machine learning methods. BMJ, page e078378, April 2024.
[7] Gary S. Collins, Johannes B. Reitsma, Douglas G. Altman, and Karel G.M. Moons. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (tripod): The tripod statement. Annals of Internal Medicine, 162(1):55–63, January 2015.
[8] Thomas P A Debray, Gary S Collins, Richard D Riley, Kym I E Snell, Ben Van Calster, Johannes B Reitsma, and Karel G M Moons. Transparent reporting of multivariable prediction models developed or validated using clustered data: Tripod-cluster checklist. BMJ, 380:e071018, February 2023.
[9] H.O. Georgii. Stochastics: Introduction to Probability and Statistics. De Gruyter textbook. Walter De Gruyter, 2008.
[10] Tilmann Gneiting and Adrian E Raftery. Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association, 102(477):359–378, 2007.
[11] David J. Hand. Assessing the performance of classification methods. International Statistical Review, 80(3):400–414, 2012.
[12] F.E. Harrell. Regression Modeling Strategies: With Applications to Linear Models, Logistic and Ordinal Regression, and Survival Analysis. Springer Series in Statistics. Springer International Publishing, 2015.
[13] Frank E Harrell Jr. Hmisc: Harrell Miscellaneous, 2025. R package version 5.2-3.
[14] Linard Hoessly. On misconceptions about the brier score in binary prediction models. Global Epidemiology, 11:100242, June 2026.
[15] Linard Hoessly and Matthew Parry. How to evaluate probabilistic prediction models: Key metrics. Journal of Clinical Epidemiology, page 112247, March 2026.
[16] Kathleen F. Kerr, Marshall D. Brown, Kehao Zhu, and Holly Janes. Assessing the clinical impact of risk prediction models with decision curves: Guidance for correct interpretation and appropriate use. Journal of Clinical Oncology, 34(21):2534–2540, July 2016.
[17] William A. Knaus, Frank E. Harrell, Joanne Lynn, Lee Goldman, Russell S. Phillips, Alfred F. Connors, Neal V. Dawson, William J. Fulkerson, Robert M. Califf, Norman Desbiens, Peter Layde, Robert K. Oye, Paul E. Bellamy, Rosemarie B. Hakim, and Douglas P. Wagner. The support prognostic model: Objective estimates of survival for seriously ill hospitalized a...
[18] Michael A. Kohn and Thomas B. Newman. Visualizing the value of diagnostic tests and prediction models, part ii. net benefit graphs: net benefit as a function of the exchange rate. Journal of Clinical Epidemiology, 181:111690, May 2025.
[19] Kerry L. Lee, Lynn H. Woodlief, Eric J. Topol, W. Douglas Weaver, Amadeo Betriu, Jacques Col, Maarten Simoons, Phil Aylward, Frans Van de Werf, and Robert M. Califf. Predictors of 30-day mortality in the era of reperfusion for acute myocardial infarction: Results from an international trial of 41 021 patients. Circulation, 91(6):1659–1668, March 1995.
[20] Hendrik-Jan Mijderwijk and Daan Nieboer. Is My Clinical Prediction Model Clinically Useful? A Primer on Decision Curve Analysis, page 115–118. Springer International Publishing, December 2021.
[21] Stephen G. Pauker and Jerome P. Kassirer. Therapeutic decision making: A cost-benefit analysis. New England Journal of Medicine, 293(5):229–234, July 1975.
[22] Margaret S. Pepe, Jing Fan, Ziding Feng, Thomas Gerds, and Jorgen Hilden. The net reclassification index (nri): A misleading measure of prediction improvement even with independent test data sets. Statistics in Biosciences, 7(2):282–295, August 2014.
[23] Brendan M. Reilly and Arthur T. Evans. Translating clinical research into clinical practice: Impact of using prediction rules to make decisions. Annals of Internal Medicine, 144(3):201–209, February 2006.
[24] Martin Rezáč and František Rezáč. How to measure the quality of credit scoring models. Finance a Uver: Czech Journal of Economics & Finance, 61(5), 2011.
[25] Valentin Rousson and Thomas Zumbrunn. Decision curve analysis revisited: overall net benefit, relationships to roc curve analysis, and application to case-control studies. BMC Medical Informatics and Decision Making, 11(1), June 2011.
[26] Kaspar Rufibach. Use of brier score to assess binary predictions. Journal of Clinical Epidemiology, 63(8):938–939, August 2010.
[27] Daniel D. Sjoberg. dcurves: Decision Curve Analysis for Model Evaluation, 2024. R package version 0.5.0.
[28] Kym I E Snell, Brooke Levis, Johanna A A Damen, Paula Dhiman, Thomas P A Debray, Lotty Hooft, Johannes B Reitsma, Karel G M Moons, Gary S Collins, and Richard D Riley. Transparent reporting of multivariable prediction models for individual prognosis or diagnosis: checklist for systematic reviews and meta-analyses (tripod-srma). BMJ, 381:e073538, May 2023.
[29] T Sorahan and M S Gilthorpe. Non-differential misclassification of exposure always leads to an underestimate of risk: an incorrect conclusion. Occupational and Environmental Medicine, 51(12):839–840, December 1994.
[30] E.W. Steyerberg. Clinical Prediction Models: A Practical Approach to Development, Validation, and Updating. Statistics for Biology and Health. Springer International Publishing, 2019.
[31] Ewout W Steyerberg, Andrew J Vickers, Nancy R Cook, Thomas Gerds, Mithat Gonen, Nancy Obuchowski, Michael J Pencina, and Michael W Kattan. Assessing the performance of prediction models: a framework for traditional and novel measures. Epidemiology, 21(1):128–138, January 2010.
[32] Rajesh Talluri and Sanjay Shete. Using the weighted area under the net benefit curve for decision curve analysis. BMC Medical Informatics and Decision Making, 16(1), July 2016.
[33] Ben Van Calster, Laure Wynants, Jan F.M. Verbeek, Jan Y. Verbakel, Evangelia Christodoulou, Andrew J. Vickers, Monique J. Roobol, and Ewout W. Steyerberg. Reporting and interpreting decision curve analysis: A guide for investigators. European Urology, 74(6):796–804, December 2018.
[34] Jan Y. Verbakel, Ewout W. Steyerberg, Hajime Uno, Bavo De Cock, Laure Wynants, Gary S. Collins, and Ben Van Calster. Roc curves for clinical prediction models part 1. roc plots showed no added value above the auc when evaluating the performance of clinical prediction models. Journal of Clinical Epidemiology, 126:207–216, October 2020.
[35] Andrew J Vickers. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. The American Statistician, 62(4):314–320, November 2008.
[36] Andrew J. Vickers and Elena B. Elkin. Decision curve analysis: A novel method for evaluating prediction models. Medical Decision Making, 26(6):565–574, November 2006.
[37] Andrew J Vickers, Ben Van Calster, and Ewout W Steyerberg. Net benefit approaches to the evaluation of prediction models, molecular markers, and diagnostic tests. BMJ, page i6, January 2016.
[38] Andrew J. Vickers, Ben van Calster, and Ewout W. Steyerberg. A simple, step-by-step guide to interpreting decision curve analysis. Diagnostic and Prognostic Research, 3(1), October 2019.
[39] Emma Wallace, Susan M Smith, Rafael Perera-Salazar, Paul Vaucher, Colin McCowan, Gary Collins, Jan Verbakel, Monica Lakhanpaul, and Tom Fahey. Framework for the impact analysis and implementation of clinical prediction rules (cprs). BMC Medical Informatics and Decision Making, 11(1), October 2011.
[40] Qian M. Zhou, Lu Zhe, Russell J. Brooke, Melissa M. Hudson, and Yan Yuan. A relationship between the incremental values of area under the ROC curve and of area under the precision-recall curve. Diagnostic and Prognostic Research, 5(1):13, July 2021.

Excerpt captured with reference [40] (Appendix A, Mathematical derivations; A.1, Bounds on PPV implied by net benefit): For a fixed incidence $I$, the fra... $\ldots\ \left[\max\left(\frac{\mathrm{NB}(t)}{\mathrm{NB}(t)+\frac{1}{t}\,(I-\mathrm{NB}(t))}\,(1-t)+t,\ \frac{\mathrm{NB}(t)}{\mathrm{NB}(t)+\frac{1}{1-t}\,(1-I)}\,(1-t)+t\right),\ 1\right]$ if $\mathrm{NB}(t)>0$; $\{0, t\}$ if $\mathrm{NB}(t)=0$; ...

[1] [1]

Understanding diagnostic tests 1: sensitivity, specificity and predictive values.Acta Paediatrica, 96(3):338?341, February 2007

Anthony K Akobeng. Understanding diagnostic tests 1: sensitivity, specificity and predictive values.Acta Paediatrica, 96(3):338?341, February 2007

work page 2007

[2] [2]

Statistics notes: Diagnostic tests 2: predictive values

Douglas G Altman and J Martin Bland. Statistics notes: Diagnostic tests 2: predictive values. BMJ, 309(6947):102, 1994

work page 1994

[3] [3]

Collins, Andrew J

Ben Van Calster, Gary S. Collins, Andrew J. Vickers, Laure Wynants, Kathleen F. Kerr, Lasai Barrenada, Gael Varoquaux, Karandeep Singh, Karel G. M. Moons, Tina Hernandez- boussard, Dirk Timmerman, David J. Mclernon, Maarten Van Smeden, and Ewout W. Steyer- berg. Performance evaluation of predictive ai models to support medical decisions: Overview and guid...

work page 2024

[4] [4]

Ben Van Calster and Andrew J. Vickers. Calibration of risk prediction models: Impact on decision-analytic performance.Medical Decision Making, 35(2):162–169, 2015. PMID: 25155798

work page 2015

[5] [5]

G. S. Collins and D. G. Altman. Predicting the 10 year risk of cardiovascular disease in the united kingdom: independent and external validation of an updated version of qrisk2.BMJ, 344(jun21 1):e4181?e4181, June 2012

work page 2012

[6] [6]

Tripod+ai statement: updated guidance for reporting clinical prediction models that use regression or machine learning methods.BMJ, page e078378, April 2024

Gary S Collins, Karel G M Moons, Paula Dhiman, Richard D Riley, Andrew L Beam, Ben Van Calster, Marzyeh Ghassemi, Xiaoxuan Liu, Johannes B Reitsma, Maarten van Smeden, Anne-Laure Boulesteix, Jennifer Catherine Camaradou, Leo Anthony Celi, Spiros Denaxas, Alastair K Denniston, Ben Glocker, Robert M Golub, Hugh Harvey, Georg Heinze, Michael M Hoffman, Andre...

work page 2024

[7] [7]

Collins, Johannes B

Gary S. Collins, Johannes B. Reitsma, Douglas G. Altman, and Karel G.M. Moons. Trans- parent reporting of a multivariable prediction model for individual prognosis or diagnosis (tripod): The tripod statement.Annals of Internal Medicine, 162(1):55?63, January 2015

work page 2015

[8] [8]

Transparent reporting of multivariable pre- diction models developed or validated using clustered data: Tripod-cluster checklist.BMJ, 380:e071018, February 2023

Thomas P A Debray, Gary S Collins, Richard D Riley, Kym I E Snell, Ben Van Calster, Johannes B Reitsma, and Karel G M Moons. Transparent reporting of multivariable pre- diction models developed or validated using clustered data: Tripod-cluster checklist.BMJ, 380:e071018, February 2023

work page 2023

[9] [9]

Georgii.Stochastics: Introduction to Probability and Statistics

H.O. Georgii.Stochastics: Introduction to Probability and Statistics. De Gruyter textbook. Walter De Gruyter, 2008

work page 2008

[10] [10]

Strictly proper scoring rules, prediction, and esti- mation.Journal of the American Statistical Association, 102(477):359–378, 2007

Tilmann Gneiting and Adrian E Raftery. Strictly proper scoring rules, prediction, and esti- mation.Journal of the American Statistical Association, 102(477):359–378, 2007

work page 2007

[11] [11]

David J. Hand. Assessing the performance of classification methods.International Statistical Review, 80(3):400–414, 2012

work page 2012

[12] [12]

Harrell.Regression Modeling Strategies: With Applications to Linear Models, Logis- tic and Ordinal Regression, and Survival Analysis

F.E. Harrell.Regression Modeling Strategies: With Applications to Linear Models, Logis- tic and Ordinal Regression, and Survival Analysis. Springer Series in Statistics. Springer International Publishing, 2015

work page 2015

[13] [13]

R package version 5.2-3

Frank E Harrell Jr.Hmisc: Harrell Miscellaneous, 2025. R package version 5.2-3. 12 LINARD HOESSLY

work page 2025

[14] [14]

On misconceptions about the brier score in binary prediction models.Global Epidemiology, 11:100242, June 2026

Linard Hoessly. On misconceptions about the brier score in binary prediction models.Global Epidemiology, 11:100242, June 2026

work page 2026

[15] [15]

How to evaluate probabilistic prediction models: Key metrics.Journal of Clinical Epidemiology, page 112247, March 2026

Linard Hoessly and Matthew Parry. How to evaluate probabilistic prediction models: Key metrics.Journal of Clinical Epidemiology, page 112247, March 2026

work page 2026

[16] [16]

Kerr, Marshall D

Kathleen F. Kerr, Marshall D. Brown, Kehao Zhu, and Holly Janes. Assessing the clinical impact of risk prediction models with decision curves: Guidance for correct interpretation and appropriate use.Journal of Clinical Oncology, 34(21):2534?2540, July 2016

work page 2016

[17] [17]

Knaus, Frank E

William A. Knaus, Frank E. Harrell, Joanne Lynn, Lee Goldman, Russell S. Phillips, Alfred F. Connors, Neal V. Dawson, William J. Fulkerson, Robert M. Califf, Norman Desbiens, Peter Layde, Robert K. Oye, Paul E. Bellamy, Rosemarie B. Hakim, and Douglas P. Wagner. The support prognostic model: Objective estimates of survival for seriously ill hospitalized a...

work page 1995

[18] [18]

Kohn and Thomas B

Michael A. Kohn and Thomas B. Newman. Visualizing the value of diagnostic tests and prediction models, part ii. net benefit graphs: net benefit as a function of the exchange rate. Journal of Clinical Epidemiology, 181:111690, May 2025

work page 2025

[19] [19]

Lee, Lynn H

Kerry L. Lee, Lynn H. Woodlief, Eric J. Topol, W. Douglas Weaver, Amadeo Betriu, Jacques Col, Maarten Simoons, Phil Aylward, Frans Van de Werf, and Robert M. Califf. Predictors of 30-day mortality in the era of reperfusion for acute myocardial infarction: Results from an international trial of 41 021 patients.Circulation, 91(6):1659?1668, March 1995

work page 1995

[20] [20]

Springer International Publishing, December 2021

Hendrik-Jan Mijderwijk and Daan Nieboer.Is My Clinical Prediction Model Clinically Use- ful? A Primer on Decision Curve Analysis, page 115?118. Springer International Publishing, December 2021

work page 2021

[21] [21]

Pauker and Jerome P

Stephen G. Pauker and Jerome P. Kassirer. Therapeutic decision making: A cost-benefit analysis.New England Journal of Medicine, 293(5):229?234, July 1975

work page 1975

[22] [22]

Pepe, Jing Fan, Ziding Feng, Thomas Gerds, and Jorgen Hilden

Margaret S. Pepe, Jing Fan, Ziding Feng, Thomas Gerds, and Jorgen Hilden. The net reclassi- fication index (nri): A misleading measure of prediction improvement even with independent test data sets.Statistics in Biosciences, 7(2):282?295, August 2014

[23] Brendan M. Reilly and Arthur T. Evans. Translating clinical research into clinical practice: Impact of using prediction rules to make decisions. Annals of Internal Medicine, 144(3):201–209, February 2006.

[24] Martin Rezáč and František Rezáč. How to measure the quality of credit scoring models. Finance a Uver: Czech Journal of Economics & Finance, 61(5), 2011.

[25] Valentin Rousson and Thomas Zumbrunn. Decision curve analysis revisited: overall net benefit, relationships to ROC curve analysis, and application to case-control studies. BMC Medical Informatics and Decision Making, 11(1), June 2011.

[26] Kaspar Rufibach. Use of Brier score to assess binary predictions. Journal of Clinical Epidemiology, 63(8):938–939, August 2010.

[27] Daniel D. Sjoberg. dcurves: Decision Curve Analysis for Model Evaluation, 2024. R package version 0.5.0.

[28] Kym I E Snell, Brooke Levis, Johanna A A Damen, Paula Dhiman, Thomas P A Debray, Lotty Hooft, Johannes B Reitsma, Karel G M Moons, Gary S Collins, and Richard D Riley. Transparent reporting of multivariable prediction models for individual prognosis or diagnosis: checklist for systematic reviews and meta-analyses (TRIPOD-SRMA). BMJ, 381:e073538, May 2023.

[29] T Sorahan and M S Gilthorpe. Non-differential misclassification of exposure always leads to an underestimate of risk: an incorrect conclusion. Occupational and Environmental Medicine, 51(12):839–840, December 1994.

[30] E.W. Steyerberg. Clinical Prediction Models: A Practical Approach to Development, Validation, and Updating. Statistics for Biology and Health. Springer International Publishing, 2019.

[31] Ewout W Steyerberg, Andrew J Vickers, Nancy R Cook, Thomas Gerds, Mithat Gonen, Nancy Obuchowski, Michael J Pencina, and Michael W Kattan. Assessing the performance of prediction models: a framework for traditional and novel measures. Epidemiology, 21(1):128–138, January 2010.

[32] Rajesh Talluri and Sanjay Shete. Using the weighted area under the net benefit curve for decision curve analysis. BMC Medical Informatics and Decision Making, 16(1), July 2016.

[33] Ben Van Calster, Laure Wynants, Jan F.M. Verbeek, Jan Y. Verbakel, Evangelia Christodoulou, Andrew J. Vickers, Monique J. Roobol, and Ewout W. Steyerberg. Reporting and interpreting decision curve analysis: A guide for investigators. European Urology, 74(6):796–804, December 2018.

[34] Jan Y. Verbakel, Ewout W. Steyerberg, Hajime Uno, Bavo De Cock, Laure Wynants, Gary S. Collins, and Ben Van Calster. ROC curves for clinical prediction models part 1. ROC plots showed no added value above the AUC when evaluating the performance of clinical prediction models. Journal of Clinical Epidemiology, 126:207–216, October 2020.

[35] Andrew J Vickers. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. The American Statistician, 62(4):314–320, November 2008.

[36] Andrew J. Vickers and Elena B. Elkin. Decision curve analysis: A novel method for evaluating prediction models. Medical Decision Making, 26(6):565–574, November 2006.

[37] Andrew J Vickers, Ben Van Calster, and Ewout W Steyerberg. Net benefit approaches to the evaluation of prediction models, molecular markers, and diagnostic tests. BMJ, page i6, January 2016.

[38] Andrew J. Vickers, Ben van Calster, and Ewout W. Steyerberg. A simple, step-by-step guide to interpreting decision curve analysis. Diagnostic and Prognostic Research, 3(1), October 2019.

[39] Emma Wallace, Susan M Smith, Rafael Perera-Salazar, Paul Vaucher, Colin McCowan, Gary Collins, Jan Verbakel, Monica Lakhanpaul, and Tom Fahey. Framework for the impact analysis and implementation of clinical prediction rules (CPRs). BMC Medical Informatics and Decision Making, 11(1), October 2011.

[40] Qian M. Zhou, Lu Zhe, Russell J. Brooke, Melissa M. Hudson, and Yan Yuan. A relationship between the incremental values of area under the ROC curve and of area under the precision-recall curve. Diagnostic and Prognostic Research, 5(1):13, July 2021.

Appendix A. Mathematical derivations

A.1. Bounds on PPV implied by net benefit. For a fixed incidence I, the fra...

\[
\begin{cases}
\left[\max\left(\dfrac{NB(t)}{NB(t)+\frac{1}{t}\,(I-NB(t))}\,(1-t)+t,\ \dfrac{NB(t)}{NB(t)+\frac{1}{1-t}\,(1-I)}\,(1-t)+t\right),\ 1\right], & \text{if } NB(t)>0,\\[1ex]
\{0,\ t\}, & \text{if } NB(t)=0,
\end{cases}
\]

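As an illustrative aside, the lower PPV bound for NB(t) > 0 can be checked numerically. The sketch below is not the author's code. It assumes the standard definition NB(t) = TP/n - (FP/n) * t/(1-t); the function names and the example counts are hypothetical. Under that definition, PPV = (1-t) * NB/(NB + (FP/n)/(1-t)) + t, and bounding FP/n from above by (I - NB)(1-t)/t (since TP/n cannot exceed I) and by 1 - I gives the two terms inside the maximum.

```python
def net_benefit(tp: int, fp: int, n: int, t: float) -> float:
    """Net benefit at threshold t, assuming NB = TP/n - FP/n * t/(1-t)."""
    return tp / n - fp / n * t / (1 - t)


def ppv_from_nb(nb: float, fp_rate: float, t: float) -> float:
    """PPV rewritten via net benefit; fp_rate is FP/n (assumed notation)."""
    return (1 - t) * nb / (nb + fp_rate / (1 - t)) + t


def ppv_lower_bound(nb: float, incidence: float, t: float) -> float:
    """Lower end of the PPV range implied by NB(t) > 0 and incidence I."""
    if nb <= 0:
        raise ValueError("this bound is only stated for NB(t) > 0")
    # Term from TP/n <= I, i.e. FP/n <= (I - NB)(1 - t)/t
    via_events = nb / (nb + (incidence - nb) / t) * (1 - t) + t
    # Term from FP/n <= 1 - I
    via_non_events = nb / (nb + (1 - incidence) / (1 - t)) * (1 - t) + t
    return max(via_events, via_non_events)


# Hypothetical validation data: 1000 patients, 200 events, threshold 0.10,
# 150 true positives and 300 false positives among those above threshold.
n, events, t = 1000, 200, 0.10
tp, fp = 150, 300

nb = net_benefit(tp, fp, n, t)
ppv_direct = tp / (tp + fp)
ppv_via_nb = ppv_from_nb(nb, fp / n, t)
lower = ppv_lower_bound(nb, events / n, t)

print(f"NB = {nb:.4f}, PPV = {ppv_direct:.4f}, lower bound = {lower:.4f}")
```

In this hypothetical example the directly computed PPV agrees with the value obtained from net benefit, and it lies between the computed lower bound and 1.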