pith. machine review for the scientific record.

arxiv: 2510.20616 · v2 · submitted 2025-10-23 · 💻 cs.LG

Recognition: unknown

On Optimal Hyperparameters for Differentially Private Deep Transfer Learning

Authors on Pith: no claims yet
classification 💻 cs.LG
keywords: privacy · better · private · clipping · compute · cumulative · current · differentially
Abstract

Differentially private (DP) transfer learning, i.e., fine-tuning a pretrained model on private data, is the current state-of-the-art approach for training large models under privacy constraints. We focus on two key hyperparameters in this setting: the clipping bound $C$ and batch size $B$. We show a clear mismatch between the current theoretical understanding of how to choose an optimal $C$ (stronger privacy requires smaller $C$) and empirical outcomes (larger $C$ performs better under strong privacy), caused by changes in the gradient distributions. Assuming a limited compute budget (fixed epochs), we demonstrate that the existing heuristics for tuning $B$ do not work, while cumulative DP noise better explains whether smaller or larger batches perform better. We also highlight how the common practice of using a single $(C,B)$ setting across tasks can lead to suboptimal performance. We find that performance drops especially when moving between loose and tight privacy and between plentiful and limited compute, which we explain by analyzing clipping as a form of gradient re-weighting and examining cumulative DP noise.
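The abstract's central ideas, clipping as gradient re-weighting and cumulative DP noise under a fixed-epoch budget, can be illustrated with a minimal numpy sketch of a DP-SGD step. This is an illustrative assumption about the standard mechanism, not code from the paper; the function name `dp_sgd_step` and all parameters are hypothetical.

```python
import numpy as np

def dp_sgd_step(per_example_grads, C, noise_multiplier, rng):
    """One simplified DP-SGD update on a batch of per-example gradients.

    Clipping to norm bound C re-weights example i's gradient by
    min(1, C / ||g_i||); Gaussian noise with std C * noise_multiplier
    is added to the clipped sum before averaging over the batch.
    """
    B = per_example_grads.shape[0]
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    weights = np.minimum(1.0, C / np.maximum(norms, 1e-12))  # re-weighting view of clipping
    clipped_sum = (per_example_grads * weights).sum(axis=0)
    noisy_sum = clipped_sum + rng.normal(0.0, C * noise_multiplier, size=clipped_sum.shape)
    return noisy_sum / B

# Under a fixed compute budget of E epochs on N examples, the number of
# noisy steps is E * N / B, while the per-step noise std in the averaged
# gradient is C * noise_multiplier / B -- the trade-off behind the
# paper's cumulative-noise analysis of batch-size choice.
```

With `noise_multiplier = 0` the step reduces to plain clipped-gradient averaging, which makes the re-weighting effect easy to inspect in isolation.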

This paper has not been read by Pith yet.

discussion (0)


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. DPrivBench: Benchmarking LLMs' Reasoning for Differential Privacy

    cs.LG · 2026-04 · unverdicted · novelty 7.0

    DPrivBench shows that top LLMs handle basic differential privacy mechanisms but fail on advanced algorithms, exposing gaps in automated DP reasoning.