arXiv: 2404.17772 · v3 · submitted 2024-04-27 · stat.ME · stat.CO

PWEXP: An R Package Using Piecewise Exponential Model for Study Design and Event/Timeline Prediction

Pith reviewed 2026-05-24 02:08 UTC · model grok-4.3

classification stat.ME · stat.CO
keywords piecewise exponential model · R package · clinical trial design · hazard estimation · event prediction · change-point selection · survival analysis · power calculation

The pith

The PWEXP R package fits piecewise exponential models with automatic change-point selection to support accurate power calculations and event timeline predictions in clinical trials.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces an R package that applies the piecewise exponential hazard model to right-censored survival data. It uses AIC, BIC, and cross-validation log-likelihood to select the number and locations of change-points that best fit the data. This lets the model capture a range of survival patterns without forcing a single parametric form such as the exponential distribution. When the selection works, the resulting hazard estimates support less biased sample-size and power calculations at the design stage and more reliable forecasts of when a target number of events will occur during the trial. The package supplies visualization tools to show the fitted curves and change-points.
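
For reference, the model and selection criteria described above can be written compactly. The notation here is an assumption chosen for this summary (change-points \tau_j, segment hazards \lambda_j, event counts d_j, exposure totals E_j, parameter count p, sample size n) and may differ from the paper's own symbols.

```latex
\begin{align*}
\lambda(t) &= \lambda_j \quad \text{for } \tau_{j-1} < t \le \tau_j,\quad j = 1,\dots,K+1,\quad \tau_0 = 0,\ \tau_{K+1} = \infty \\
S(t) &= \exp\Big(-\sum_{j=1}^{K+1} \lambda_j \max\{0,\ \min(t,\tau_j) - \tau_{j-1}\}\Big) \\
\ell(\lambda,\tau) &= \sum_{j=1}^{K+1} \big(d_j \log \lambda_j - \lambda_j E_j\big), \qquad \hat\lambda_j = d_j / E_j \\
\mathrm{AIC} &= 2p - 2\hat\ell, \qquad \mathrm{BIC} = p \log n - 2\hat\ell
\end{align*}
```

Here d_j is the number of observed events falling in segment j and E_j is the total follow-up time accumulated in that segment by all subjects, censored or not. For fixed change-points the maximum-likelihood hazards have the closed form shown, which is what keeps the model cheap to fit.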

Core claim

The PWEXP package estimates piecewise exponential hazard models for right-censored data by selecting the optimal number and positions of change-points according to AIC, BIC, and cross-validation log-likelihood. This produces accurate and robust hazard estimates that can be used for reliable power calculation at study design and timeline prediction at study conduct, offering a superior balance of flexibility and robustness compared with other existing approaches.
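
To make the design-stage use concrete, here is a minimal Python sketch of how piecewise constant hazards could feed a sample-size calculation. It is not the PWEXP API: the function names are hypothetical, and it assumes the Schoenfeld events approximation with 1:1 allocation, proportional hazards between arms, a common fixed follow-up, and no drop-out or staggered accrual.

```python
import math
from statistics import NormalDist


def pwe_survival(t, hazards, breaks):
    """S(t) under piecewise constant hazards; len(hazards) == len(breaks) + 1."""
    edges = [0.0] + list(breaks) + [math.inf]
    cumhaz = 0.0
    for lam, lo, hi in zip(hazards, edges[:-1], edges[1:]):
        if t > lo:
            # Time spent in this segment before t, times the segment hazard.
            cumhaz += lam * (min(t, hi) - lo)
    return math.exp(-cumhaz)


def required_events(hr, alpha=0.05, power=0.9):
    """Schoenfeld approximation: events needed for a two-sided log-rank test, 1:1 allocation."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return math.ceil(4 * z ** 2 / math.log(hr) ** 2)


def required_sample_size(hr, control_hazards, breaks, follow_up, alpha=0.05, power=0.9):
    """Total subjects so that the expected event count reaches the required number.

    Assumption: every subject is followed for exactly `follow_up` time units.
    """
    d = required_events(hr, alpha, power)
    p_ctrl = 1 - pwe_survival(follow_up, control_hazards, breaks)
    p_trt = 1 - pwe_survival(follow_up, [hr * lam for lam in control_hazards], breaks)
    return math.ceil(d / ((p_ctrl + p_trt) / 2))


n_total = required_sample_size(0.7, [0.10, 0.05], [6.0], follow_up=12.0)
```

The point of the sketch is that a mis-specified control hazard changes the event probabilities, and therefore the sample size, even when the required number of events stays the same.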

What carries the argument

Piecewise exponential hazard model that divides time into segments of constant hazard, with automatic change-point selection via AIC, BIC, and cross-validation log-likelihood.
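
A minimal Python sketch of this machinery follows. It is an illustration rather than the package's implementation: the function names are hypothetical, a single change-point is chosen by a plain grid search, and counting each estimated change-point as one parameter in AIC/BIC is an assumption that may not match PWEXP.

```python
import math


def pwe_fit(times, events, breaks):
    """Closed-form MLE of segment hazards for fixed change-points.

    times  : follow-up times (event or censoring), all > 0
    events : 1 if the event was observed, 0 if right-censored
    breaks : sorted interior change-points, e.g. [6.0, 12.0]
    """
    edges = [0.0] + list(breaks) + [math.inf]
    k = len(edges) - 1
    deaths = [0] * k
    exposure = [0.0] * k
    for t, d in zip(times, events):
        for j in range(k):
            lo, hi = edges[j], edges[j + 1]
            if t > lo:
                exposure[j] += min(t, hi) - lo  # time at risk inside segment j
                if d and t <= hi:
                    deaths[j] += 1              # event falls in segment j
    hazards = [deaths[j] / exposure[j] if exposure[j] > 0 else 0.0 for j in range(k)]
    loglik = sum(
        deaths[j] * math.log(hazards[j]) - hazards[j] * exposure[j]
        for j in range(k)
        if deaths[j] > 0
    )
    return hazards, loglik


def info_criteria(loglik, n_hazards, n_breaks, n_obs):
    """AIC and BIC; assumes each hazard and each change-point is one parameter."""
    p = n_hazards + n_breaks
    return 2 * p - 2 * loglik, p * math.log(n_obs) - 2 * loglik


def best_single_break(times, events, candidates):
    """Grid search: the candidate change-point with the highest log-likelihood."""
    best = None
    for c in candidates:
        hazards, loglik = pwe_fit(times, events, [c])
        if best is None or loglik > best[2]:
            best = (c, hazards, loglik)
    return best
```

Comparing `info_criteria` for the zero-change-point fit against the best one-change-point fit gives the simplest version of the selection step. In practice candidate grids are usually restricted so that every segment contains several events, because the likelihood can become arbitrarily large when a segment shrinks around a single event.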

If this is right

  • Trial designers can obtain sample-size and power figures that reflect a wider set of possible survival curves.
  • Study teams can generate forecasts of the calendar time needed to reach the required event count.
  • Hazard estimates can adapt to changes in risk over time without manual specification of segments.
  • Visualization output makes it easier to inspect where the model places the change-points.
  • The same fitted model supports both initial design calculations and mid-study updates.
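
The timeline forecast in the second bullet can be sketched by simulation. The Python below is a simplified illustration with hypothetical function names, not the package's method: it treats all subjects as not yet having had an event, ignores drop-out, and does not condition on data observed to date, all of which a mid-study prediction would need to handle.

```python
import math
import random


def pwe_inverse_cumhaz(target, hazards, breaks):
    """Return the time t at which the cumulative hazard H(t) equals `target`."""
    edges = [0.0] + list(breaks) + [math.inf]
    cum = 0.0
    for lam, lo, hi in zip(hazards, edges[:-1], edges[1:]):
        if lam > 0:
            seg = lam * (hi - lo)  # cumulative hazard contributed by this segment
            if target <= cum + seg:
                return lo + (target - cum) / lam
            cum += seg
    return math.inf  # hazard is zero from here on, so the event never occurs


def pwe_sample(hazards, breaks, rng):
    """Draw one event time by inverting the cumulative hazard at an Exp(1) draw."""
    return pwe_inverse_cumhaz(rng.expovariate(1.0), hazards, breaks)


def predict_event_time(enroll_times, hazards, breaks, target, nsim=2000, seed=1):
    """Monte Carlo 2.5%, 50% and 97.5% quantiles of the calendar time of the target-th event."""
    rng = random.Random(seed)
    reached = sorted(
        sorted(e + pwe_sample(hazards, breaks, rng) for e in enroll_times)[target - 1]
        for _ in range(nsim)
    )

    def quantile(p):
        return reached[min(nsim - 1, int(p * nsim))]

    return quantile(0.025), quantile(0.5), quantile(0.975)


# Hypothetical design: 100 subjects enrolled uniformly over 10 months, target of 60 events.
enroll = [10 * i / 100 for i in range(100)]
low, median, high = predict_event_time(enroll, [0.10, 0.05], [6.0], target=60, nsim=500)
```

The width of the interval between `low` and `high` reflects only sampling variability of event times under fixed hazards; uncertainty in the estimated hazards and change-points would widen it further.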

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same change-point selection procedure could be tested on survival data from fields other than clinical trials, such as reliability or epidemiology.
  • Running the package on datasets where the true change-points are known from external information would directly check whether the selection criteria recover those points.
  • Linking the package output to simulation engines for clinical trials would allow users to evaluate design operating characteristics under the fitted hazard.
  • If selected change-points correspond to known treatment or disease milestones, the model could also serve as an exploratory tool for understanding risk shifts.

Load-bearing premise

Criteria such as AIC, BIC, and cross-validation log-likelihood will reliably identify change-points that produce better out-of-sample predictions of event counts and analysis times than alternative models.
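
One way to probe this premise is to score candidate change-point sets on data the fit has not seen. The Python sketch below uses plain K-fold cross-validation of the log-likelihood with hypothetical function names; the package's cv.pwexp.fit takes an nsim argument and may use a different resampling scheme, so treat this as an analogue rather than a reproduction.

```python
import math
import random


def segment_stats(times, events, breaks):
    """Event counts and total exposure time within each hazard segment."""
    edges = [0.0] + list(breaks) + [math.inf]
    deaths = [0] * (len(edges) - 1)
    exposure = [0.0] * (len(edges) - 1)
    for t, d in zip(times, events):
        for j, (lo, hi) in enumerate(zip(edges[:-1], edges[1:])):
            if t > lo:
                exposure[j] += min(t, hi) - lo
                if d and t <= hi:
                    deaths[j] += 1
    return deaths, exposure


def heldout_loglik(hazards, times, events, breaks):
    """Log-likelihood of (times, events) under hazards estimated elsewhere."""
    deaths, exposure = segment_stats(times, events, breaks)
    total = 0.0
    for lam, d, e in zip(hazards, deaths, exposure):
        if d > 0:
            if lam <= 0:
                return -math.inf  # an event where the fitted hazard is zero
            total += d * math.log(lam)
        total -= lam * e
    return total


def cv_loglik(times, events, breaks, k=5, seed=1):
    """Sum of held-out log-likelihoods over k folds; larger is better."""
    idx = list(range(len(times)))
    random.Random(seed).shuffle(idx)
    total = 0.0
    for f in range(k):
        test = set(idx[f::k])
        train = [i for i in idx if i not in test]
        d_tr, e_tr = segment_stats(
            [times[i] for i in train], [events[i] for i in train], breaks
        )
        hazards = [d / e if e > 0 else 0.0 for d, e in zip(d_tr, e_tr)]
        total += heldout_loglik(
            hazards, [times[i] for i in test], [events[i] for i in test], breaks
        )
    return total
```

Calling `cv_loglik` with the same data and several candidate `breaks` lists, then keeping the largest value, mirrors the selection logic in miniature. It still measures fit to held-out survival times, not accuracy of predicted event counts or analysis dates, which is the gap the premise points to.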

What would settle it

A comparison across multiple right-censored clinical trial datasets in which the PWEXP-selected model's predicted number of events or analysis times shows larger average error than predictions from a standard exponential model or another parametric approach.

Figures

Figures reproduced from arXiv: 2404.17772 by Rachael Wen, Tianchen Xu, Wen Zhang.

Figure 1: Diagram of the process for estimating a PWE model using the pwexp.fit() function.
Figure 2: Survival function of an example dataset. This is an illustration of different approaches to adjust …
Figure 3: Main functions and structure of the PWEXP package.
    The PWEXP package follows a software architecture based on functional programming. Each main function returns an instance of an 'S3' class, which implements relevant 'S3' methods to facilitate downstream analyses. The PWEXP package offers a comprehensive suite of tools tailored for analyzing survival data with a piecewise exponential distribution. It facili…
Figure 4: (a) KM curve for the simulated dataset; (b) Event curve for the simulated dataset.
Figure 5: Visualization of PWE models with different number of change-points.
Figure 6: BIC of fitted PWE models.
    R> fit_a1_cv <- cv.pwexp.fit(fit_a1, nsim = 100)
    R> fit_b0_cv <- cv.pwexp.fit(fit_b0, nsim = 100)
    R> fit_b1_cv <- cv.pwexp.fit(fit_b1, nsim = 100)
    R> fit_b2_cv <- cv.pwexp.fit(fit_b2, nsim = 100, parallel = TRUE, mc.core = 10)
    R> fit_b3_cv <- cv.pwexp.fit(fit_b3, nsim = 100, parallel = TRUE, mc.core = 10)
    R> fit_b4_cv <- cv.pwexp.fit(train$followT, train$event, nbreak = 4, nsim = 1…
Figure 7: Cross validation log-likelihood of fitted PWE models.
Figure 8: Visualization of the final model with 95% CI.
Figure 9: Visualization of the fitted censoring model for drop-out with 95% CI.
Figure 10: Predicted number of future events.
    R> plot_event(train$followT_abs, abs_time = T, event = train$event, add = T, col = 'blue')
    R> pred_event_confidence <- plot_event(predicted_boot, eval_at = seq(45, 90, 5),
    +    type = 'confidence')
    R> pred_event_predictive <- plot_event(predicted_boot, eval_at = seq(45, 90, 5),
    +    type = 'predictive', CI_par = list(col = 'purple'))
    R> legend('bottomright', c('data used to train…
Figure 11: Predicted timeline for given number of future events.
Figure 12: OS curve for IPI 4-5 subjects (Ruppert et al., 2020). Colored curves are fitted parametric models.
Figure 13: Expected number of events according to theoretical calculations from the …
Figure 14: Relationship between variables in the generated dataset.
Original abstract

Parametric assumptions such as exponential distribution are commonly used in clinical trial design and analysis. However, violation of distribution assumptions can introduce biases in sample size and power calculations. Piecewise exponential (PWE) hazard model partitions the hazard function into segments each with constant hazards and is easy for interpretation and computation. Due to its piecewise property, PWE can fit a wide range of survival curves and accurately predict the future number of events and analysis time in event-driven clinical trials, thus enabling more flexible and reliable study designs. Compared with other existing approaches, the PWE model provides a superior balance of flexibility and robustness in model fitting and prediction. The proposed PWEXP package is designed for estimating and predicting PWE hazard models for right-censored data. By utilizing well-established criteria such as AIC, BIC, and cross-validation log-likelihood, the PWEXP package chooses the optimal number of change-points and determines the optimal position of change-points. With its particular goodness-of-fit, the PWEXP provides accurate and robust hazard estimation, which can be used for reliable power calculation at study design and timeline prediction at study conduct. The package also offers visualization functions to facilitate the interpretation of survival curve fitting results.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The manuscript presents the PWEXP R package for fitting piecewise exponential (PWE) hazard models to right-censored data. It selects the number and locations of change-points via AIC, BIC, and cross-validation log-likelihood, then uses the fitted models for power/sample-size calculations at the design stage and for event-count/timeline predictions during study conduct. The central claim is that this approach achieves a superior balance of flexibility and robustness relative to other existing methods for clinical-trial applications.

Significance. If the change-point selection reliably yields good out-of-sample predictions, the package would supply a practical, interpretable tool for relaxing the constant-hazard assumption while retaining computational simplicity. The provision of an R implementation together with visualization and prediction utilities constitutes a concrete software contribution that could be adopted by trial statisticians.

major comments (2)
  1. [Abstract] Abstract: the claim that the PWE model 'provides a superior balance of flexibility and robustness in model fitting and prediction' is stated without any accompanying simulation study, real-data benchmark, or quantitative comparison against parametric, semi-parametric, or alternative piecewise methods that would substantiate the asserted superiority.
  2. [Methods (change-point selection)] Section describing change-point selection: reliance on AIC, BIC, and cross-validation log-likelihood is presented as the mechanism for choosing the number and positions of change-points, yet no evidence (simulation or otherwise) is supplied that these criteria produce models with demonstrably better predictive performance for future event counts or analysis times under right-censoring typical of event-driven trials.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful review and constructive comments on our manuscript. We address each major comment below and indicate planned revisions to the manuscript.

Point-by-point responses
  1. Referee: [Abstract] Abstract: the claim that the PWE model 'provides a superior balance of flexibility and robustness in model fitting and prediction' is stated without any accompanying simulation study, real-data benchmark, or quantitative comparison against parametric, semi-parametric, or alternative piecewise methods that would substantiate the asserted superiority.

    Authors: We agree that the abstract's claim of superiority is not supported by quantitative comparisons in the current manuscript. The statement was intended to reflect the model's theoretical properties (piecewise constant hazards allowing flexible shapes while remaining computationally tractable and interpretable for trial design), but we recognize that unsubstantiated superiority claims should be avoided. We will revise the abstract to remove the comparative superiority language and instead describe the PWE approach in terms of its flexibility for hazard estimation and its practical utility for event prediction in right-censored settings.

    Revision: yes

  2. Referee: [Methods (change-point selection)] Section describing change-point selection: reliance on AIC, BIC, and cross-validation log-likelihood is presented as the mechanism for choosing the number and positions of change-points, yet no evidence (simulation or otherwise) is supplied that these criteria produce models with demonstrably better predictive performance for future event counts or analysis times under right-censoring typical of event-driven trials.

    Authors: The manuscript presents AIC, BIC, and cross-validation as standard, well-established criteria for model selection in the PWEXP package, drawing on their common use in survival analysis. However, we acknowledge that the paper does not include simulation studies or benchmarks demonstrating superior out-of-sample predictive performance for event counts or timelines under typical right-censoring patterns. We will add a brief discussion noting the reliance on these established criteria. If space permits in a revision, we will also include a small simulation or real-data example to illustrate predictive behavior; otherwise, we will qualify the text so that it does not imply unverified superiority in predictive accuracy.

    Revision: partial

Circularity Check

0 steps flagged

No significant circularity

Full rationale

The paper describes an R package implementing standard piecewise exponential fitting with AIC/BIC/CV-based change-point selection for right-censored data. No equations, derivations, or self-citations appear that reduce any claimed prediction (event counts, timeline, power) to fitted inputs by construction. Model selection uses external criteria, and downstream predictions are ordinary outputs of the fitted model rather than tautological renamings or self-referential steps. The manuscript is self-contained as a software description without load-bearing internal reductions.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The package rests on standard survival-analysis assumptions and established model-selection methods; no new entities or free parameters are introduced beyond those already present in the piecewise exponential literature.

axioms (1)
  • domain assumption Right-censored survival times can be adequately described by a piecewise constant hazard function whose change-points are identifiable by AIC, BIC, or cross-validation.
    Invoked when the package selects the number and positions of change-points for hazard estimation.


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

  • Cost.FunctionalEquation washburn_uniqueness_aczel (relation to the paper passage: unclear)

    Paper passage: The PWEXP package chooses the optimal number of change-points and determines the optimal position of change-points. With its particular goodness-of-fit, the PWEXP provides accurate and robust hazard estimation...

What do these tags mean?

  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

28 extracted references · 28 canonical work pages

  1. Anderson, K. (2023). gsDesign: Group Sequential Design. R package version 3.6.0.
  2. Bagiella, E. and D. F. Heitjan (2001). Predicting analysis times in randomized clinical trials. Statistics in Medicine 20(14), 2055–2063.
  3. Betensky, R. A. (2015). Measures of follow-up in time-to-event studies: Why provide them and what should they be? Clinical Trials 12(4), 403–408.
  4. Chapple, A. G., T. Peak, and A. Hemal (2020). A novel Bayesian continuous piecewise linear log-hazard model, with estimation and inference via reversible jump Markov chain Monte Carlo. Statistics in Medicine 39(12), 1766–1780.
  5. Cooney, P. and A. White (2021a). Change-point detection for piecewise exponential models. arXiv preprint arXiv:2112.03962.
  6. Cooney, P. and A. White (2021b). Change-point detection for piecewise exponential models. arXiv, 1–21.
  7. Dupuy, J.-F. (2006). Estimation in a change-point hazard regression model. Statistics & Probability Letters 76(2), 182–190.
  8. Friedman, M. (1982). Piecewise exponential models for survival data with covariates. The Annals of Statistics 10(1), 101–113.
  9. Frumento, P. (2021). pch: Piecewise Constant Hazard Models for Censored and Truncated Data. R package version 2.0.
  10. Goodman, M. S., Y. Li, and R. C. Tiwari (2011). Detecting multiple change points in piecewise constant hazard functions. Journal of Applied Statistics 38(11), 2523–2532.
  11. Henderson, R. (1990). A problem with the likelihood ratio test for a change-point hazard rate model. Biometrika 77(4), 835–843.
  12. Hess, K. and G. Robert (2021). muhaz: Hazard Function Estimation in Survival Analysis. R package version 1.2.6.4.
  13. Kim, J., S. Cheon, and Z. Jin (2020). Bayesian multiple change-points estimation for hazard with censored survival data from exponential distributions. Journal of the Korean Statistical Society 49, 15–31.
  14. Klein, J. P., M. L. Moeschberger, et al. (2003). Survival Analysis: Techniques for Censored and Truncated Data, Volume 1230. Springer.
  15. Küchenhoff, H. (1996). An exact algorithm for estimating breakpoints in segmented generalized linear models.
  16. Li, Y., L. Qian, and W. Zhang (2013). Estimation in a change-point hazard regression model with long-term survivors. Statistics & Probability Letters 83(7), 1683–1691.
  17. Liu, N., Y. Zhou, and J. J. Lee (2021). IPDfromKM: Reconstruct individual patient data from published Kaplan-Meier survival curves. BMC Medical Research Methodology 21(1), 111.
  18. Loubert, S. K. (1986). Inference Procedures for the Piecewise Exponential Model When the Data Are Arbitrarily Censored. Iowa State University.
  19. Matthews, D. E. and V. T. Farewell (1982). On testing for a constant hazard against a change-point alternative. Biometrics, 463–468.
  20. Muggeo, V. M. (2003). Estimating regression models with unknown break-points. Statistics in Medicine 22(19), 3055–3071.
  21. Qian, L. and W. Zhang (2013). Multiple change-point detection in piecewise exponential hazard regression models with long-term survivors and right censoring. In Contemporary Developments in Statistical Theory: A Festschrift for Hira Lal Koul, pp. 289–304. Springer.
  22. Rohatgi, A. (2024). WebPlotDigitizer: Version 4.7.
  23. Rufibach, K. (2022). eventTrack: Event Prediction for Time-to-Event Endpoints. R package version 1.0.2.
  24. Ruppert, A. S., J. G. Dixon, G. Salles, A. Wall, D. Cunningham, V. Poeschel, C. Haioun, H. Tilly, H. Ghesquieres, M. Ziepert, et al. (2020). International prognostic indices in diffuse large B-cell lymphoma: A comparison of IPI, R-IPI, and NCCN-IPI. Blood, The Journal of the American Society of Hematology 135(23), 2041–2048.
  25. Wang, J., C. Ke, Q. Jiang, C. Zhang, and S. Snapinn (2012). Predicting analysis time in event-driven clinical trials with event-reporting lag. Statistics in Medicine 31(9), 801–811.
  26. Wassmer, G. and F. Pahlke (2024). rpact: Confirmatory Adaptive Clinical Trial Design and Analysis. R package version 3.5.1.
  27. Yao, Y.-C. (1986). Maximum likelihood estimation in hazard rate models with a change-point. Communications in Statistics-Theory and Methods 15(8), 2455–2466.
  28. Ying, G.-s. and D. F. Heitjan (2008). Weibull prediction of event times in clinical trials. Pharmaceutical Statistics: The Journal of Applied Statistics in the Pharmaceutical Industry 7(2), 107–120.