Why decision curves go above or below treat-all and treat-none: a PPV- and calibration-based guide for clinical prediction models
Pith reviewed 2026-05-21 09:39 UTC · model grok-4.3
The pith
Net benefit comparisons to treat-all and treat-none reduce to threshold-specific observed risks, connecting decision curves to subgroup calibration and positive predictive value.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Comparisons with treat-none and treat-all can be expressed through threshold-specific observed risk in patients above and below the decision threshold, linking decision-curve performance to calibration in clinically relevant subgroups. Net benefit also relates to positive predictive value, offering a more intuitive explanation of when acting on model predictions is justified. The derivations are illustrated and positive predictive value curves are proposed as a practical complement to decision curves.
What carries the argument
Threshold-specific observed risk above and below the decision threshold, together with its algebraic link to positive predictive value.
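The algebraic link can be written out in a few lines. The derivation below is a sketch under assumed notation, not a transcription of the paper: $s_t$ is taken to be the fraction of patients with predicted risk at or above the threshold $t$, $\bar{Y}_{\ge t}$ the observed event rate among them (which equals $\mathrm{PPV}(t)$), and $\mathrm{TP}_t$, $\mathrm{FP}_t$, $n$ the usual true-positive count, false-positive count, and sample size.

```latex
\begin{align*}
\mathrm{NB}(t)
  &= \frac{\mathrm{TP}_t}{n} - \frac{\mathrm{FP}_t}{n}\cdot\frac{t}{1-t} \\
  &= s_t\,\bar{Y}_{\ge t} - s_t\,\bigl(1-\bar{Y}_{\ge t}\bigr)\,\frac{t}{1-t} \\
  &= \frac{s_t}{1-t}\,\Bigl[\bar{Y}_{\ge t}\,(1-t) - \bigl(1-\bar{Y}_{\ge t}\bigr)\,t\Bigr] \\
  &= \frac{s_t}{1-t}\,\bigl(\bar{Y}_{\ge t} - t\bigr).
\end{align*}
```

Since $s_t/(1-t)$ is positive whenever at least one patient is above the threshold, the sign of $\mathrm{NB}(t)$ is the sign of $\bar{Y}_{\ge t} - t$.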
If this is right
- A model that is well calibrated among patients above the threshold has an observed risk in that group at or above the threshold, so its decision curve lies on or above the treat-none line.
- Net benefit is positive exactly when positive predictive value at the threshold exceeds the threshold probability t; equivalently, when the odds PPV/(1 − PPV) exceed the harm-to-benefit ratio t/(1 − t).
- Positive predictive value curves supply an alternative visual check on the same information that decision curves display.
- Poor calibration in the high-risk subgroup directly lowers or eliminates the apparent advantage of the model over treat-none.
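The consequences listed above can be illustrated numerically. The following is a minimal Python sketch, not the paper's code: the toy outcomes and predictions are hypothetical, the function names `confusion_at`, `net_benefit`, and `ppv` are illustrative, and treating patients with predicted risk at or above the threshold is an assumed convention.

```python
from typing import Optional, Sequence, Tuple


def confusion_at(y: Sequence[int], p: Sequence[float], t: float) -> Tuple[int, int]:
    """Count true and false positives when patients with p >= t are treated."""
    tp = sum(1 for yi, pi in zip(y, p) if pi >= t and yi == 1)
    fp = sum(1 for yi, pi in zip(y, p) if pi >= t and yi == 0)
    return tp, fp


def net_benefit(y: Sequence[int], p: Sequence[float], t: float) -> float:
    """Standard net benefit: TP/n - FP/n * t/(1 - t)."""
    tp, fp = confusion_at(y, p, t)
    n = len(y)
    return tp / n - fp / n * t / (1 - t)


def ppv(y: Sequence[int], p: Sequence[float], t: float) -> Optional[float]:
    """Observed risk among patients with p >= t (None if nobody is above t)."""
    tp, fp = confusion_at(y, p, t)
    return tp / (tp + fp) if tp + fp else None


# Hypothetical toy data: the patient with the highest prediction has no event,
# so the high-risk subgroup is poorly calibrated.
y = [0, 0, 0, 1, 0, 1, 0, 1, 1, 0]
p = [0.05, 0.10, 0.15, 0.20, 0.30, 0.40, 0.55, 0.60, 0.80, 0.90]

rows = []
for t in [0.10, 0.25, 0.45, 0.75]:
    nb, v = net_benefit(y, p, t), ppv(y, p, t)
    rows.append((t, nb, v))
    print(f"t={t:.2f}  NB={nb:+.3f}  PPV={v:.3f}  PPV>t: {v > t}")
```

Plotting the third column of `rows` against the first gives a crude PPV curve. In this toy data, net benefit turns negative at the highest threshold, where the observed risk above the threshold falls below the threshold itself.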
Where Pith is reading between the lines
- This framing could guide targeted recalibration efforts focused only on the risk range where decisions are actually made.
- Clinicians might choose thresholds by inspecting positive predictive value rather than net benefit alone.
- The same observed-risk decomposition might apply to other threshold-based decision metrics beyond net benefit.
Load-bearing premise
That threshold-specific observed risks and positive predictive values directly represent clinical utility without further conditions on data quality or population traits.
What would settle it
A dataset in which the net benefit value computed at a threshold fails to match the value obtained from the observed risk among patients whose predicted risk exceeds that threshold.
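A check of this kind can be run on any validation set. The sketch below uses hypothetical toy data and illustrative function names (`nb_standard`, `nb_from_observed_risk`), and assumes the convention that patients with predicted risk at or above the threshold are treated. It compares net benefit from the standard formula with the value rebuilt from the observed risk above the threshold.

```python
from typing import Sequence


def nb_standard(y: Sequence[int], p: Sequence[float], t: float) -> float:
    """Net benefit from true/false positive counts: TP/n - FP/n * t/(1 - t)."""
    n = len(y)
    tp = sum(yi for yi, pi in zip(y, p) if pi >= t)
    fp = sum(1 - yi for yi, pi in zip(y, p) if pi >= t)
    return tp / n - fp / n * t / (1 - t)


def nb_from_observed_risk(y: Sequence[int], p: Sequence[float], t: float) -> float:
    """Net benefit rebuilt as s_t / (1 - t) * (observed risk above t - t)."""
    above = [yi for yi, pi in zip(y, p) if pi >= t]
    if not above:
        return 0.0  # nobody is treated, same as treat-none
    s_t = len(above) / len(y)          # fraction of patients above the threshold
    obs_risk = sum(above) / len(above)  # observed event rate in that subgroup
    return s_t / (1 - t) * (obs_risk - t)


# Hypothetical toy data, for illustration only.
y = [0, 0, 0, 1, 0, 1, 0, 1, 1, 0]
p = [0.05, 0.10, 0.15, 0.20, 0.30, 0.40, 0.55, 0.60, 0.80, 0.90]

thresholds = [0.10, 0.25, 0.45, 0.75, 0.95]
max_gap = max(
    abs(nb_standard(y, p, t) - nb_from_observed_risk(y, p, t)) for t in thresholds
)
print(f"largest discrepancy across thresholds: {max_gap:.2e}")
```

Any discrepancy beyond floating-point rounding would point to a computational inconsistency, for example a different tie convention in the two calculations.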
Original abstract
Net benefit is widely used and reported to evaluate the clinical utility of prediction models, yet its interpretation often remains difficult in practice. In this didactical note, we develop two complementary interpretations that make net benefit easier to understand for clinical audiences. We show that comparisons with treat-none and treat-all can be expressed through threshold-specific observed risk in patients above and below the decision threshold, linking decision-curve performance to calibration in clinically relevant subgroups. We also show how net benefit relates to positive predictive value, offering a more intuitive explanation of when acting on model predictions is justified. We derive and illustrate these results and propose positive predictive value curves as a practical complement to decision curves.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper is a didactical note deriving two complementary interpretations of net benefit in decision-curve analysis for clinical prediction models. It shows that net-benefit comparisons against treat-all and treat-none can be algebraically re-expressed using the empirical event rate (observed risk) among patients whose predicted probability exceeds the decision threshold (and symmetrically below it), thereby linking decision-curve performance to calibration within clinically relevant subgroups. It further relates net benefit at a given threshold to the positive predictive value of the model at that threshold. The derivations are illustrated and the authors propose PPV curves as a practical complement to decision curves.
Significance. If the algebraic identities hold, the manuscript offers a useful pedagogical contribution by grounding the interpretation of decision curves in observable quantities (subgroup calibration and PPV) rather than the abstract net-benefit formula alone. The derivations are parameter-free and follow directly from substitution into the standard net-benefit expression, so they hold for any fixed threshold and any joint distribution of predictions and outcomes. This strengthens the practical teaching and application of decision-curve analysis without introducing new empirical claims or assumptions about data quality.
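The treat-all comparison follows from the same kind of substitution. The lines below are a sketch under assumed notation rather than the paper's own presentation: $I$ denotes the overall event rate, $s_t$ the fraction of patients with predicted risk at or above $t$, and $\bar{Y}_{\ge t}$ and $\bar{Y}_{<t}$ the observed event rates above and below the threshold, so that $I = s_t\,\bar{Y}_{\ge t} + (1-s_t)\,\bar{Y}_{<t}$.

```latex
\begin{align*}
\mathrm{NB}_{\text{all}}(t)
  &= I - (1-I)\,\frac{t}{1-t} = \frac{I - t}{1-t}, \\[4pt]
\mathrm{NB}(t) - \mathrm{NB}_{\text{all}}(t)
  &= \frac{s_t\,\bar{Y}_{\ge t} - s_t\,t - I + t}{1-t} \\
  &= \frac{(1-s_t)\,t - (1-s_t)\,\bar{Y}_{<t}}{1-t} \\
  &= \frac{1-s_t}{1-t}\,\bigl(t - \bar{Y}_{<t}\bigr).
\end{align*}
```

Under these definitions the model beats treat-all at threshold $t$ exactly when the observed risk among patients below the threshold is lower than $t$, which mirrors the treat-none comparison for patients above it.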
minor comments (3)
- The manuscript would benefit from an explicit statement early in the derivations section confirming that the re-expressions are identities that hold by definition once the standard net-benefit formula is substituted, to avoid any impression of additional modeling assumptions.
- Figure legends and axis labels for the proposed PPV curves should more clearly distinguish them from the conventional decision curves and indicate whether the PPV axis is plotted on the same probability scale.
- A brief note on the handling of ties or continuous versus discrete thresholds would clarify the practical computation of the threshold-specific observed risks.
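The point about ties can be made concrete with a small example. The sketch below is illustrative only: the data are hypothetical, `observed_risk_above` is not a function from the paper or from any package, and which convention the authors use is not stated in this report. It shows how the threshold-specific observed risk changes when predictions equal to the threshold are counted as above it or not.

```python
from typing import Optional, Sequence, Tuple


def observed_risk_above(
    y: Sequence[int], p: Sequence[float], t: float, inclusive: bool = True
) -> Tuple[Optional[float], int]:
    """Observed event rate and group size among patients above the threshold.

    inclusive=True selects p >= t; inclusive=False selects p > t.
    """
    selected = [yi for yi, pi in zip(y, p) if (pi >= t if inclusive else pi > t)]
    if not selected:
        return None, 0
    return sum(selected) / len(selected), len(selected)


# Hypothetical discrete predictions, e.g. from a points-based score:
# three patients sit exactly on the threshold 0.5.
y = [0, 1, 0, 0, 1, 1]
p = [0.2, 0.2, 0.5, 0.5, 0.5, 0.8]

incl = observed_risk_above(y, p, 0.5, inclusive=True)
excl = observed_risk_above(y, p, 0.5, inclusive=False)
print("p >= t:", incl)
print("p >  t:", excl)
```

With continuous predictions the two conventions rarely differ, but with discrete scores the choice can move patients between subgroups, so it is worth stating explicitly.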
Simulated Author's Rebuttal
We thank the referee for their positive and constructive assessment of our didactical note. We appreciate the recognition that the algebraic links between net benefit, subgroup calibration, and positive predictive value offer a useful pedagogical contribution to decision-curve analysis. No specific major comments were raised in the report, so we have no points to address individually. We will incorporate any minor editorial suggestions during revision.
Circularity Check
No significant circularity identified
The paper's central derivations consist of algebraic re-expressions that substitute the standard net-benefit formula into expressions involving threshold-specific observed risks and positive predictive value. These identities hold by definition for any fixed threshold and any joint distribution of predictions and outcomes, without fitted parameters, self-referential equations, or load-bearing self-citations. The derivations are self-contained, drawing only on prior standard definitions of net benefit, PPV, and calibration rather than reducing to the paper's own inputs by construction.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Standard definitions of positive predictive value and calibration as functions of predicted and observed risks hold in the relevant patient subgroups.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Passage: $\mathrm{NB}(t) = \frac{s_t}{1-t}\,\bigl(\bar{Y}_{\ge t} - t\bigr)$ and $\mathrm{NB}(t) > 0 \iff \mathrm{PPV}(t) > t$
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Appendix A. Mathematical derivations
A.1. Bounds on PPV implied by net benefit. For a fixed incidence I, the fra...