
arXiv: 2604.21491 · v1 · submitted 2026-04-23 · 💻 cs.CR · stat.AP · stat.ME

Benchmarking the Utility of Privacy-Preserving Cox Regression Under Data-Driven Clipping Bounds: A Multi-Dataset Simulation Study

Pith reviewed 2026-05-09 21:30 UTC · model grok-4.3

classification 💻 cs.CR · stat.AP · stat.ME
keywords differential privacy · Cox regression · survival analysis · utility evaluation · Laplace mechanism · clipping bounds · privacy-utility tradeoff

The pith

Differential privacy at standard levels erases statistical significance for most predictors in Cox survival models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper benchmarks how differential privacy (DP) mechanisms affect the utility of Cox proportional hazards regression on clinical survival data. Using five datasets (n = 168 to 6,524), 15 privacy budgets, and 1,000 Monte Carlo iterations, it shows that at privacy budgets commonly treated as standard (ε ≤ 1), roughly 90% of significant covariates lose their significance and predictions often become no better than chance. It compares three input perturbations (covariates only, all inputs, and a discrete-time model) against output perturbation: perturbing only covariates works best among the input methods, while output perturbation stays near baseline at ε ≥ 5. The work concludes that practical applications require substantially larger privacy budgets than those typically used.

Core claim

At standard DP levels (ε ≤ 1), approximately 90% (90–94%) of the significant covariates lost significance, even in the largest dataset (n = 6,524), and the predictive performance approached random levels (test C-index ≈ 0.5) under many conditions. Among input perturbation approaches, perturbing only covariates preserved the risk-set structure and achieved the best recovery, whereas output perturbation maintained near-baseline performance at ε ≥ 5. At n ≈ 3,000, the significance recovered rapidly at ε = 3–10; however, in practice, ε ≥ 10 (for predictive performance) to ε ≥ 30–60 (for significance preservation) is required.
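
To make the "test C-index ≈ 0.5" statement concrete, here is a minimal sketch of Harrell's concordance index in plain Python. The function name and the toy data are illustrative assumptions, not taken from the paper, which presumably relies on an established survival package for this metric.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs where the higher risk score fails first."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue                      # pairs are anchored on an observed event
        for j in range(n):
            if times[i] < times[j]:       # subject i failed while j was still at risk
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5     # tied scores count as half
    return concordant / comparable if comparable else float("nan")


# Hypothetical toy data: follow-up times, event flags (1 = event, 0 = censored).
times = [1, 2, 3, 4, 5]
events = [1, 1, 0, 1, 1]

print(concordance_index(times, events, [5, 4, 3, 2, 1]))  # 1.0 (perfect ranking)
print(concordance_index(times, events, [0, 0, 0, 0, 0]))  # 0.5 (uninformative scores)
print(concordance_index(times, events, [1, 2, 3, 4, 5]))  # 0.0 (reversed ranking)
```

A score that carries no information about event order lands at 0.5, which is the "random level" the core claim refers to.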

What carries the argument

Data-driven clipping bounds (the observed minimum and maximum of each variable), combined with the Laplace mechanism and Randomized Response, applied either to the model inputs or to the model output with noise calibrated to dfbeta-based sensitivities.
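
The input-perturbation idea can be sketched as follows, assuming a single continuous covariate whose sensitivity is taken to be the observed range. The helper names (`laplace_noise`, `perturb_covariate`) and the toy ages are hypothetical; the paper's own implementation and budget allocation are not described in this summary.

```python
import random


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample, drawn as the difference of two exponential variates."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_covariate(values, epsilon, rng):
    """Clip to the observed range, then add Laplace noise of scale (max - min) / epsilon."""
    lo, hi = min(values), max(values)     # data-driven bounds: not formally epsilon-DP
    width = hi - lo
    if width == 0:
        return list(values)               # constant column: nothing to calibrate against
    scale = width / epsilon
    # With bounds taken from the same data, the clip below is a no-op;
    # it is kept to show where fixed, data-independent bounds would act.
    return [min(max(v, lo), hi) + laplace_noise(scale, rng) for v in values]


rng = random.Random(0)
ages = [38.0, 45.0, 52.0, 61.0, 70.0, 83.0]   # hypothetical covariate values
for eps in (0.1, 1.0, 10.0, 100.0):
    noisy = perturb_covariate(ages, eps, rng)
    mean_abs = sum(abs(a - b) for a, b in zip(ages, noisy)) / len(ages)
    print(f"epsilon={eps:>5}: mean |noise| = {mean_abs:.2f}")
```

The noise scale is inversely proportional to ε, so at ε ≤ 1 the typical noise is at least as wide as the covariate's whole observed range, which is consistent with the loss of signal reported above.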

If this is right

  • Covariate-only input perturbation best preserves the risk-set structure and recovers significance better than perturbing all inputs.
  • Output perturbation using dfbeta sensitivities maintains performance close to non-private baselines once ε reaches 5 or higher.
  • Sample sizes around 3,000 allow rapid recovery of significance at ε values between 3 and 10.
  • Variables with p-values near the significance threshold show increased false-positive rates in moderate-to-high ε ranges.
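
As a rough illustration of the output-perturbation bullet above, the sketch below adds Laplace noise to already-fitted coefficients. It assumes that the per-coefficient sensitivity is the largest absolute dfbeta value and that ε is split equally across coefficients; both choices, the function names, and the toy numbers are assumptions for illustration, not the paper's stated procedure. A dfbeta matrix of this shape can be obtained, for example, from `residuals(fit, type = "dfbeta")` in R's survival package.

```python
import random


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample, drawn as the difference of two exponential variates."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_coefficients(beta, dfbeta, epsilon, rng):
    """Add Laplace noise to each coefficient, scaled by its largest |dfbeta|."""
    p = len(beta)
    eps_each = epsilon / p                          # assumed: equal budget split
    noisy = []
    for j in range(p):
        # dfbeta[i][j] approximates the change in coefficient j when row i is removed.
        sensitivity = max(abs(row[j]) for row in dfbeta)
        if sensitivity == 0:
            noisy.append(beta[j])
        else:
            noisy.append(beta[j] + laplace_noise(sensitivity / eps_each, rng))
    return noisy


# Hypothetical fitted coefficients and dfbeta rows (3 subjects, 2 covariates).
beta = [0.50, -1.20]
dfbeta = [[0.010, -0.030], [-0.020, 0.015], [0.005, 0.040]]

rng = random.Random(0)
for eps in (1.0, 5.0, 50.0):
    print(eps, [round(b, 3) for b in perturb_coefficients(beta, dfbeta, eps, rng)])
```

Because dfbeta values are typically small relative to the coefficients, the noise shrinks quickly as ε grows. Note that a maximum taken over the observed dfbeta values is itself data-dependent, so this sketch shares the formal-DP caveat attached to the clipping bounds.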

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Using formally correct sensitivity bounds instead of observed min/max would likely cause even greater utility loss.
  • New privacy mechanisms specifically designed for survival models may be needed to achieve usable privacy-utility tradeoffs.
  • Clinical studies relying on shared survival data might need to accept weaker privacy guarantees or larger sample sizes to retain statistical power.

Load-bearing premise

The clipping bounds derived from observed data minima and maxima do not enforce formal differential privacy, making the measured utility loss an optimistic lower bound.
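
A small sketch of why observed bounds fall short of a formal guarantee: replacing a single record can change the bounds, and with them the noise scale, so the calibration itself depends on individual data. The cohort values and function names below are hypothetical.

```python
def observed_bounds(values):
    """Data-driven clipping bounds: the observed minimum and maximum."""
    return min(values), max(values)


def laplace_scale(values, epsilon):
    """Noise scale implied by treating the observed range as the sensitivity."""
    lo, hi = observed_bounds(values)
    return (hi - lo) / epsilon


cohort = [41.0, 47.0, 55.0, 58.0, 63.0]
neighbor = cohort[:-1] + [97.0]       # same cohort with one record replaced

print(laplace_scale(cohort, 1.0))     # 22.0
print(laplace_scale(neighbor, 1.0))   # 56.0
```

Formal ε-DP requires the sensitivity to be fixed independently of the data, for instance from a publicly justified range. Such a range is generally at least as wide as the observed one, which is why the reported utility loss is read as optimistic.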

What would settle it

Re-running the simulations with formal ε-DP mechanisms whose sensitivities do not depend on observed data bounds, and checking whether significance loss remains near 90% at ε = 1.
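
The check described here comes down to recomputing a loss-of-significance rate. The sketch below is one plausible reading of that metric, assuming it is the share of covariates significant at baseline (p < α) that are no longer significant after perturbation; the paper's exact definition is not given in this summary, and the p-values are made up for illustration.

```python
def loss_of_significance_rate(baseline_p, private_p, alpha=0.05):
    """Share of baseline-significant covariates that are not significant under DP."""
    significant = [j for j, p in enumerate(baseline_p) if p < alpha]
    if not significant:
        return float("nan")               # undefined when nothing was significant
    lost = sum(1 for j in significant if private_p[j] >= alpha)
    return lost / len(significant)


# Hypothetical p-values for four covariates, before and after perturbation.
baseline = [0.001, 0.020, 0.300, 0.040]
private = [0.200, 0.010, 0.030, 0.600]

# Covariates 0, 1 and 3 are significant at baseline; 0 and 3 lose significance.
print(loss_of_significance_rate(baseline, private))   # 2/3, about 0.667
```

In a simulation study this quantity would be averaged over the Monte Carlo iterations at each ε. Covariate 2 in the toy example, which becomes significant only after perturbation, does not enter the LSR and would instead count toward a false-positive rate.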

Figures

Figures reproduced from arXiv: 2604.21491 by Hiroaki Kikuchi, Keita Fukuyama, Tomohiro Kuroda, Yukiko Mori.

Figure 1: ε-dependence of the loss of significance rate (LSR). Three input perturbation approaches (Phase 1–3) compared across five datasets. Dashed lines indicate ε = 3 and ε = 10. Phase 1 (covariates only) achieves the best recovery across all ε ranges; Phase 2 (all inputs) is inferior due to risk-set destruction; Phase 3 (discrete-time) shows no improvement from discretization under the same budget structure. Lun…
Figure 2: ε-dependence of the test C-index. Four methods (Phase 1–3, output perturbation dfbeta) compared across five datasets. Ribbons indicate ±1 standard deviation (SD) (B = 1,000 train/test splits). Triangle markers at the right end indicate the baseline under the non-DP condition (ε = ∞). Gray solid lines indicate the baseline C-index; dashed lines indicate random prediction (0.5).
Figure 3: ε-dependence of the mean false positive rate (FPR) for three input perturbation methods (Phase 1–3) across five datasets. The dashed line indicates the nominal α = 0.05. At low ε, FPR ≈ α, but it increases at moderate-to-high ε (10–100) and converges to 0 as ε → ∞.

Table III: Phase 1 LSR progression in the transition zone (ε = 3–10).

Dataset  ε = 3  ε = 5  ε = 7  ε = 10
lung     0.928  0.896  0.868  0.808
pbc      0.935  0.…
Figure 4: Variable-level FPR for the PBC dataset (…)
Figure 5: Hazard ratio (HR) distributions for the top five covariates in each dataset under Phase 1 (covariate-only perturbation). Dots indicate the mean HR …
Figure 6: HR relative bias for the top five covariates under Phase 1 (…)
Original abstract

Differential privacy (DP) is a mathematical framework that guarantees individual privacy; however, systematic evaluation of its impact on statistical utility in survival analyses remains limited. In this study, we systematically evaluated the impact of DP mechanisms (Laplace mechanism and Randomized Response) with data-driven clipping bounds on the Cox proportional hazards model, using 5 clinical datasets (n = 168–6,524), 15 levels of ε (0.1–1000), and B = 1,000 Monte Carlo iterations. The data-driven clipping bounds used here are observed min/max and therefore do not provide formal ε-DP guarantees; the results represent an optimistic lower bound on utility degradation under formal DP. We compared three types of input perturbations (covariates only, all inputs, and the discrete-time model) with output perturbations (dfbeta-based sensitivity), using loss of significance rate (LSR), C-index, and coefficient bias as metrics. At standard DP levels (ε ≤ 1), approximately 90% (90–94%) of the significant covariates lost significance, even in the largest dataset (n = 6,524), and the predictive performance approached random levels (test C-index ≈ 0.5) under many conditions. Among the input perturbation approaches, perturbing only covariates preserved the risk-set structure and achieved the best recovery, whereas output perturbation (dfbeta-based sensitivity) maintained near-baseline performance at ε ≥ 5. At n ≈ 3,000, the significance recovered rapidly at ε = 3–10; however, in practice, ε ≥ 10 (for predictive performance) to ε ≥ 30–60 (for significance preservation) is required. In the moderate-to-high ε range, false-positive rates increased for variables whose baseline p-values were near the significance threshold.
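
For readers unfamiliar with the second mechanism named in the abstract, the following is a minimal sketch of Warner-style randomized response on a binary variable. It assumes the standard ε-calibrated keep probability 1 / (1 + e^(−ε)); which variables the paper applies it to (for example the event indicator or binary covariates) is an assumption here, as are the function names and simulated data.

```python
import math
import random


def keep_probability(epsilon):
    """Probability of reporting the true bit; equals e^eps / (1 + e^eps)."""
    return 1.0 / (1.0 + math.exp(-epsilon))   # this form avoids overflow at large eps


def randomized_response(bits, epsilon, rng):
    """Report each bit truthfully with the keep probability, otherwise flip it."""
    keep = keep_probability(epsilon)
    return [b if rng.random() < keep else 1 - b for b in bits]


def debiased_rate(noisy_bits, epsilon):
    """Unbiased estimate of the true share of ones from the flipped bits."""
    keep = keep_probability(epsilon)
    observed = sum(noisy_bits) / len(noisy_bits)
    return (observed - (1.0 - keep)) / (2.0 * keep - 1.0)


rng = random.Random(0)
events = [1 if rng.random() < 0.3 else 0 for _ in range(50000)]   # simulated flags
for eps in (0.1, 1.0, 5.0):
    noisy = randomized_response(events, eps, rng)
    flipped = sum(a != b for a, b in zip(events, noisy)) / len(events)
    print(f"epsilon={eps}: flipped {flipped:.1%}, "
          f"debiased event rate {debiased_rate(noisy, eps):.3f}")
```

At ε = 1 the keep probability is about 0.73, so roughly a quarter of the bits are flipped. An aggregate rate can still be debiased, but a Cox model consumes the individual flags, which is one intuition for why perturbing event indicators damages the risk-set structure.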

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 1 minor

Summary. The manuscript describes an empirical evaluation of differential privacy applied to the Cox proportional hazards model using data-driven (observed min/max) clipping bounds on five real clinical datasets with sample sizes from 168 to 6,524. Using 1,000 Monte Carlo iterations at each of 15 ε values, it compares three input perturbation types (covariates only, all inputs, discrete-time) and output perturbation on loss of significance rate (LSR), C-index, and bias. The central results are that at ε ≤ 1 the LSR is 90–94% even in the largest dataset and the C-index approaches 0.5; that covariate-only perturbation best preserves the risk-set structure; and that significance recovers at ε = 3–10 for n ≈ 3,000. The authors correctly note that these are optimistic estimates, because the bounds do not ensure formal ε-DP.

Significance. If these simulation results hold, the paper makes a valuable contribution by providing concrete, multi-dataset evidence of the utility-privacy trade-off in privacy-preserving survival analysis. This is particularly important for the clinical domain where Cox models are prevalent, and the extensive simulation design with real data lends credibility to the conclusion that higher privacy budgets are needed for usable models.

minor comments (1)
  1. [Abstract] The abstract mentions the dataset size range but would benefit from a brief summary table of key dataset characteristics (e.g., number of events, covariates, and event rates) to provide immediate context for the results.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary of our simulation study and the recommendation for minor revision. The assessment accurately captures the central findings on utility degradation under data-driven clipping at typical DP levels.

Circularity Check

0 steps flagged

No circularity; purely empirical Monte Carlo benchmarking

full rationale

The paper is a simulation study that applies Laplace and Randomized Response mechanisms to Cox models across five datasets, computes LSR, C-index and bias over B = 1,000 iterations, and reports direct empirical outcomes. No derivations, fitted parameters renamed as predictions, or equations appear; the authors explicitly flag that observed min/max clipping yields no formal DP and treat all numbers as optimistic lower bounds. Central claims rest on the simulation outputs themselves rather than any self-referential reduction or self-citation chain.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is an empirical benchmarking study that relies on standard statistical methods and simulation; it introduces no new axioms, no parameters fitted to derive a claim, and no invented entities.


