Personalized Decision Making for Biopsies in Prostate Cancer Active Surveillance Programs

Anirudh Tomer; Daan Nieboer; Dimitris Rizopoulos; Ewout W. Steyerberg; Frank-Jan Drost; Monique J. Roobol

arxiv: 1907.05621 · v1 · pith:OUFOYYANnew · submitted 2019-07-12 · 📊 stat.AP · stat.ME

Pith reviewed 2026-05-24 22:22 UTC · model grok-4.3

keywords: prostate cancer · active surveillance · joint models · personalized scheduling · biopsy decisions · cumulative risk · PRIAS cohort

The pith

Joint models generate visit-specific progression risks that trigger a biopsy only when the risk crosses a threshold, cutting the median number of biopsies from ten to four while keeping detection delays similar.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a decision rule that personalizes biopsy timing for men in prostate cancer active surveillance by updating each patient's cumulative risk of progression at every visit. It fits joint models to repeated PSA and DRE measurements plus prior biopsy outcomes, then schedules the next biopsy only if the updated risk crosses a chosen threshold. In simulations that replay the actual PRIAS cohort of 5270 patients, this rule produces substantially fewer biopsies than the standard annual schedule while keeping the time to detected progression almost the same. The central payoff is a better trade-off between procedural burden and timely identification of progression.

Core claim

Joint models for longitudinal PSA, DRE and time-to-progression data produce a patient- and visit-specific cumulative risk; a biopsy is performed at the current visit only when this risk exceeds a pre-specified threshold, resulting in a median of four biopsies (IQR 2-5) versus ten (IQR 3-10) under annual scheduling and median detection delays of 0.7 years (IQR 0.3-1.0) versus 0.5 years (IQR 0.3-0.8).

What carries the argument

Joint model that combines longitudinal trajectories of PSA and DRE with a survival sub-model for time to progression, yielding an updated cumulative risk at each follow-up visit that is compared directly against a decision threshold.
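The visit-by-visit rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `schedule_biopsies`, `risk_fn`, and the constant-hazard `toy_risk` are hypothetical names, and the toy risk only stands in for the joint-model prediction, which this page does not spell out.

```python
import math
from typing import Callable, List, Optional


def schedule_biopsies(
    visit_times: List[float],
    risk_fn: Callable[[float, float], float],
    threshold: float,
    progression_time: Optional[float] = None,
) -> List[float]:
    """Return the visit times at which a biopsy is performed.

    risk_fn(last_negative, t) is assumed to give the cumulative risk of
    progression between the latest negative biopsy and visit time t.
    """
    biopsies: List[float] = []
    last_negative = 0.0  # assumption: enrolment counts as a negative biopsy at time 0
    for t in visit_times:
        if risk_fn(last_negative, t) > threshold:
            biopsies.append(t)
            if progression_time is not None and progression_time <= t:
                break  # progression detected, surveillance stops
            last_negative = t  # negative result: risk is re-computed from here
    return biopsies


def toy_risk(last_negative: float, t: float, hazard: float = 0.1) -> float:
    """Hypothetical stand-in for the joint model: constant-hazard cumulative risk."""
    return 1.0 - math.exp(-hazard * (t - last_negative))


# Half-yearly visits over ten years, threshold chosen arbitrarily for illustration.
visits = [0.5 * k for k in range(1, 21)]
plan = schedule_biopsies(visits, toy_risk, threshold=0.15)
```

In this toy setting the risk depends only on the time since the last negative biopsy, so the rule collapses to a fixed two-year interval. With a real joint model the risk would also move with each new PSA and DRE measurement, which is what makes the schedule patient-specific.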

If this is right

The number of biopsies required per detected progression drops substantially while the timing of detection stays comparable.
Patients experience fewer invasive procedures without a meaningful increase in the window during which progression could remain undetected.
The same risk-threshold rule can be recalibrated for different acceptable delay tolerances by changing the threshold value.
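The trade-off governed by the threshold can be illustrated with a small sweep. The sketch below is a toy under loud assumptions: a constant hazard replaces the joint model, the five progression times are invented for illustration, and `biopsy_gap` and `evaluate` are hypothetical helpers. None of the numbers correspond to the PRIAS results.

```python
import math
from statistics import median


def biopsy_gap(threshold, hazard, visit_gap=0.5):
    """Shortest multiple of visit_gap at which the toy risk exceeds the threshold."""
    k = 1
    while 1.0 - math.exp(-hazard * k * visit_gap) <= threshold:
        k += 1
    return k * visit_gap


def evaluate(threshold, progression_times, hazard=0.1, horizon=10.0):
    """Return (median biopsy count, median detection delay) for one threshold."""
    gap = biopsy_gap(threshold, hazard)
    counts, delays = [], []
    for t_prog in progression_times:
        n, t = 0, 0.0
        while t + gap <= horizon:
            t += gap
            n += 1
            if t >= t_prog:  # first biopsy at or after progression detects it
                delays.append(t - t_prog)
                break
        counts.append(n)
    return median(counts), (median(delays) if delays else None)


# Invented progression times (years), for illustration only.
progression_times = [1.2, 2.6, 5.1, 6.7, 9.3]
sweep = {th: evaluate(th, progression_times) for th in (0.05, 0.15, 0.25)}
```

Raising the threshold lengthens the gap between biopsies, so the median count falls while the median delay grows. Choosing a threshold amounts to picking a point on that curve.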

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The approach could be extended to incorporate patient-specific utilities for biopsy burden versus delay, turning the fixed threshold into a personalized utility-weighted one.
If the joint model is updated with new data streams such as MRI results, the risk estimates and resulting biopsy savings could increase further.
Similar risk-based scheduling logic might apply to surveillance protocols in other low-risk cancers where repeated invasive tests are currently used on a fixed calendar.

Load-bearing premise

The joint model fitted to historical PRIAS measurements produces accurate patient-specific cumulative risks of progression at future visits that can be directly thresholded for biopsy decisions.
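For reference, dynamic predictions from joint models are commonly written in the form below. The notation is an assumption made for illustration, since this page does not reproduce the paper's symbols: $T_j^*$ is the true progression time of patient $j$, $t$ the time of his latest negative biopsy, $v$ the current visit, $\mathcal{Y}_j(v)$ the PSA and DRE history up to $v$, $\mathcal{D}_n$ the data used to fit the model, and $\kappa$ the chosen risk threshold.

```latex
R_j(u \mid t, v) = \Pr\bigl\{ T_j^* \le u \mid T_j^* > t,\ \mathcal{Y}_j(v),\ \mathcal{D}_n \bigr\},
\qquad u \ge t,
\qquad
\text{biopsy at visit } v \iff R_j(v \mid t, v) > \kappa .
```

Read this way, the premise is that $R_j$ is well calibrated for new patients, not only for patients generated by the fitted model.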

What would settle it

A prospective trial in which the personalized arm shows a median increase in detection delay of more than one year or a median biopsy reduction of fewer than two procedures compared with annual scheduling.
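The criterion above can be written as a simple check. This is a sketch only: `claim_refuted` is a hypothetical helper, and it uses the difference of medians as a stand-in for the "median increase", which a real trial analysis might define differently.

```python
def claim_refuted(
    delay_personalized: float,
    delay_annual: float,
    biopsies_personalized: float,
    biopsies_annual: float,
    max_extra_delay: float = 1.0,     # years, from the criterion above
    min_biopsies_saved: float = 2.0,  # procedures, from the criterion above
) -> bool:
    """True if either condition of the settling criterion is met."""
    extra_delay = delay_personalized - delay_annual
    biopsies_saved = biopsies_annual - biopsies_personalized
    return extra_delay > max_extra_delay or biopsies_saved < min_biopsies_saved


# Medians reported in the simulation study (not a prospective trial).
simulated = claim_refuted(0.7, 0.5, 4, 10)
```

Applied to the simulation medians, the check does not fire. That is expected and says nothing about how a prospective comparison would turn out.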

Original abstract

Background: Low-risk prostate cancer patients enrolled in active surveillance programs commonly undergo biopsies for examination of cancer progression. Biopsies are conducted as per a fixed and frequent schedule (e.g., annual biopsies). Since biopsies are burdensome, patients do not always comply with the schedule, which increases the risk of delayed detection of cancer progression.
Objective: Our aim is to better balance the number of biopsies (burden) and the delay in detection of cancer progression (less is beneficial), by personalizing the decision of conducting biopsies.
Data Sources: We use patient data of the world's largest active surveillance program (PRIAS). It enrolled 5270 patients, had 866 cancer progressions, and an average of nine prostate-specific antigen (PSA) and five digital rectal examination (DRE) measurements per patient.
Methods: Using joint models for time-to-event and longitudinal data, we model the historical DRE and PSA measurements, and biopsy results of a patient at each follow-up visit. This results in a visit and patient-specific cumulative risk of cancer progression. If this risk is above a certain threshold, we schedule a biopsy. We compare this personalized approach with the currently practiced biopsy schedules via an extensive and realistic simulation study, based on a replica of the patients from the PRIAS program.
Results: The personalized approach saved a median of six biopsies (median: 4, IQR: 2-5), compared to the annual schedule (median: 10, IQR: 3-10). However, the delay in detection of progression (years) is similar for the personalized (median: 0.7, IQR: 0.3-1.0) and the annual schedule (median: 0.5, IQR: 0.3-0.8).
Conclusions: We conclude that personalized schedules provide substantially better balance in the number of biopsies per detected progression for men with low-risk prostate cancer.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Simulation from a fitted joint model on PRIAS data shows a risk-threshold biopsy rule cuts median biopsies from 10 to 4 with comparable detection delay, but the gains are likely optimistic because the test trajectories come from the same model.

The core result is a simulation that applies a joint model for PSA, DRE, and progression to decide biopsies only when the patient-specific cumulative risk exceeds a threshold. In the PRIAS replica this yields roughly six fewer biopsies per patient than annual scheduling while keeping median detection delay within 0.2 years. That is the practical claim worth noting.

The work is new mainly as an application: it takes existing joint-model machinery and turns the predicted risk into an explicit visit-by-visit decision rule, then runs it through a large, realistic simulation that mimics the actual PRIAS follow-up pattern and compliance issues. The numbers are reported clearly and the comparison to the fixed annual schedule is straightforward.

The soft spot is the simulation design itself. Because the longitudinal trajectories and event times are generated from the fitted joint model, the risk predictions are effectively in-sample. Any misspecification in the association between PSA slope and hazard, or in the baseline hazard, will not be exposed and will make the biopsy savings look better than they would on fresh patients. No external cohort or held-out real trajectories are mentioned, and the abstract gives no information on how the risk threshold was chosen or whether it was tuned on the same data. The circularity is moderate rather than fatal, but it does mean the reported medians should be treated as best-case under correct model specification.

This paper is for statisticians and urologists working on active-surveillance protocols who want a concrete example of risk-based scheduling. It is worth sending to peer review because the clinical question is real, the data source is large, and the simulation is at least transparent about its setup; referees can ask for external validation or sensitivity checks on the threshold.

Referee Report

3 major / 2 minor

Summary. The manuscript proposes personalizing biopsy decisions in prostate cancer active surveillance by fitting joint models to longitudinal PSA/DRE measurements and time-to-progression outcomes from the PRIAS cohort (5270 patients). At each visit, a patient-specific cumulative risk of progression is computed and a biopsy is scheduled if the risk exceeds a threshold. This strategy is compared to fixed schedules (e.g., annual) in a simulation study that replicates PRIAS patients, with the central claim being a median reduction of six biopsies (4 vs. 10) while maintaining comparable detection delay (0.7 vs. 0.5 years).

Significance. If the performance metrics hold under external validation, the work could reduce biopsy burden in low-risk prostate cancer management without compromising timely detection of progression, addressing a practical compliance issue in active surveillance programs. The scale of the PRIAS data and the use of joint models for dynamic risk prediction are strengths that would support clinical translation if the simulation bias concerns are resolved.

major comments (3)

[Methods (simulation study)] Methods, simulation study description: The evaluation generates trajectories and event times from the same fitted joint model used to produce the risk predictions (a 'replica of the patients from the PRIAS program'), creating an in-sample assessment. This setup risks optimistic bias in the reported median biopsy savings and delay metrics, as model misspecification (e.g., in the association between PSA trajectory and progression hazard) would not be detected; no held-out real patients or external cohorts are mentioned.
[Methods (joint model and decision rule)] Methods, risk threshold and joint model fitting: The threshold for scheduling biopsies is invoked without stating whether it was pre-specified, cross-validated, or tuned on the same PRIAS data that supplies the simulation patients. Absence of sensitivity analysis to threshold choice directly affects the robustness of the central claim that the personalized schedule achieves a median of six fewer biopsies.
[Results / Abstract] Results and Abstract: The reported performance metrics (biopsy counts and delays) are generated from quantities defined by the fitted model itself, with no reported external validation, missing-data handling details, or calibration checks for the cumulative risk predictions; this circularity undermines the conclusion that personalized schedules provide 'substantially better balance'.

minor comments (2)

[Abstract] Abstract: Provide the numerical value of the risk threshold used and note any sensitivity analyses, as these are essential for interpreting the simulation results.
[Methods] Notation: Clarify whether the cumulative risk is the dynamic prediction from the joint model at each visit or a fixed quantity, to avoid ambiguity in the decision rule.

Simulated Authors' Rebuttal

3 responses · 1 unresolved

We thank the referee for the constructive comments on our manuscript. We address each of the major comments below and have made revisions to the manuscript accordingly where appropriate.

Referee: Methods, simulation study description: The evaluation generates trajectories and event times from the same fitted joint model used to produce the risk predictions (a 'replica of the patients from the PRIAS program'), creating an in-sample assessment. This setup risks optimistic bias in the reported median biopsy savings and delay metrics, as model misspecification (e.g., in the association between PSA trajectory and progression hazard) would not be detected; no held-out real patients or external cohorts are mentioned.

Authors: The simulation study is designed to evaluate the decision-making strategy under the data-generating process defined by the joint model fitted to the PRIAS data. This approach is common in methodological papers on dynamic prediction to allow access to the true event times for calculating detection delays. We agree that this may lead to optimistic performance estimates if the model is misspecified, and we have added a paragraph in the Discussion section acknowledging this limitation and the need for future external validation studies.
revision: yes

Referee: Methods, risk threshold and joint model fitting: The threshold for scheduling biopsies is invoked without stating whether it was pre-specified, cross-validated, or tuned on the same PRIAS data that supplies the simulation patients. Absence of sensitivity analysis to threshold choice directly affects the robustness of the central claim that the personalized schedule achieves a median of six fewer biopsies.

Authors: The risk threshold was selected to achieve a clinically acceptable trade-off between biopsy reduction and detection delay, based on exploratory analyses. To address the concern, we have performed a sensitivity analysis by varying the threshold and included the results in the revised manuscript, showing that the biopsy savings remain substantial across a range of thresholds.
revision: yes

Referee: Results / Abstract: The reported performance metrics (biopsy counts and delays) are generated from quantities defined by the fitted model itself, with no reported external validation, missing-data handling details, or calibration checks for the cumulative risk predictions; this circularity undermines the conclusion that personalized schedules provide 'substantially better balance'.

Authors: We have clarified in the Methods that the joint model accounts for irregular visit times and missing measurements via the shared random effects. Additionally, we have added calibration assessments for the risk predictions in the supplementary materials. We have revised the Abstract and Conclusions to emphasize that the results are based on simulation under the fitted model, and moderated the claim of 'substantially better balance' to reflect this.
revision: partial

standing simulated objections not resolved

External validation using held-out or independent patient cohorts, as no such data were available for this analysis.

Circularity Check

1 step flagged

Simulation study evaluates personalized biopsy policy on trajectories generated from the same fitted joint model

specific steps

fitted input called prediction [Methods (joint model + simulation study)]
"Using joint models for time-to-event and longitudinal data, we model the historical DRE and PSA measurements, and biopsy results of a patient at each follow-up visit. This results in a visit and patient-specific cumulative risk of cancer progression. If this risk is above a certain threshold, we schedule a biopsy. We compare this personalized approach with the currently practiced biopsy schedules via an extensive and realistic simulation study, based on a replica of the patients from the PRIAS program."

The joint model is estimated on the PRIAS data; the simulation then creates replica patients whose PSA/DRE trajectories and progression times are drawn from that same fitted model. Applying the risk-threshold rule (also derived from the model) to these simulated trajectories therefore evaluates the policy on data whose generative process is identical to the model, so the biopsy-savings and delay figures hold only if the model is correctly specified.
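A toy calibration check makes the concern concrete. The sketch below assumes a constant-hazard model in place of the joint model; `observed_progression_fraction`, the hazards, and the two-year gap are all hypothetical choices for illustration. It compares the risk the model predicts at a biopsy with the fraction of simulated patients who have actually progressed by then, once when the data come from the model itself and once when they do not.

```python
import math
import random


def observed_progression_fraction(true_hazard, gap, n=20000, seed=0):
    """Fraction of simulated patients who progress within `gap` years."""
    rng = random.Random(seed)
    hits = sum(rng.expovariate(true_hazard) <= gap for _ in range(n))
    return hits / n


assumed_hazard = 0.1  # hazard the (toy) fitted model believes in
gap = 2.0             # years between biopsies implied by some threshold

# Risk the model reports at the biopsy.
predicted = 1.0 - math.exp(-assumed_hazard * gap)

# Data generated from the model itself: prediction and observation agree.
in_model = observed_progression_fraction(assumed_hazard, gap)

# Data generated with a different hazard: the miscalibration becomes visible.
out_of_model = observed_progression_fraction(0.3, gap)
```

When the evaluation data are drawn from the fitted model, the check passes by construction. Only data with a different generating process, such as held-out or external patients, can reveal a gap between predicted and observed risk.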

full rationale

The paper fits a joint model to the full PRIAS cohort to obtain patient-specific cumulative progression risks, then thresholds those risks to decide biopsies. The reported performance (a median of six fewer biopsies, with a comparable median delay of 0.7 years) is obtained from a simulation study that replicates PRIAS patients. Because the simulation generates new longitudinal trajectories and event times from the fitted joint model itself, the risk predictions and resulting biopsy counts are computed on data whose distribution is defined by the model parameters; any misspecification is invisible and the metrics reduce to in-sample behavior of the fitted model. This matches the fitted-input-called-prediction pattern, and the central claim rests on it.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on a fitted joint model whose parameters are estimated from the same PRIAS cohort used for simulation; the risk threshold is an additional tuning choice whose selection process is not detailed in the abstract.

free parameters (1)

risk threshold for biopsy
Chosen value above which a biopsy is scheduled; directly determines the reported trade-off between biopsy count and detection delay.

axioms (1)

domain assumption: Joint model assumptions (random effects structure, proportional hazards, correct specification of longitudinal trajectories for PSA and DRE)
Invoked when the model produces the visit-specific cumulative risk used for decisions.

pith-pipeline@v0.9.0 · 5917 in / 1487 out tokens · 31159 ms · 2026-05-24T22:22:10.477032+00:00 · methodology
