Data-dependent PAC-Bayes priors via differential privacy

arXiv: 1802.09583 · v2 · pith:CN43XCBN · submitted 2018-02-26 · cs.LG · stat.ML

Gintare Karolina Dziugaite, Daniel M. Roy

keywords: bounds, data-dependent, distribution, pac-bayes, priors, bound, data, differentially

The Probably Approximately Correct (PAC) Bayes framework (McAllester, 1999) can incorporate knowledge about the learning algorithm and (data) distribution through the use of distribution-dependent priors, yielding tighter generalization bounds on data-dependent posteriors. Using this flexibility, however, is difficult, especially when the data distribution is presumed to be unknown. We show how an ε-differentially private data-dependent prior yields a valid PAC-Bayes bound, and then show how non-private mechanisms for choosing priors can also yield generalization bounds. As an application of this result, we show that a Gaussian prior mean chosen via stochastic gradient Langevin dynamics (SGLD; Welling and Teh, 2011) leads to a valid PAC-Bayes bound given control of the 2-Wasserstein distance to an ε-differentially private stationary distribution. We study our data-dependent bounds empirically, and show that they can be nonvacuous even when other distribution-dependent bounds are vacuous.
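As a rough illustration of the SGLD step the abstract mentions, the sketch below runs the standard Welling and Teh update (half a step along a minibatch estimate of the log-posterior gradient, plus Gaussian noise whose variance equals the step size) on a toy one-dimensional Gaussian model and returns the final iterate as a candidate prior mean. The toy model, the flat prior, the function name `sgld_prior_mean`, and all hyperparameters are assumptions made for illustration; this is not the paper's experimental setup, and it does not reproduce the privacy or Wasserstein analysis.

```python
import numpy as np


def sgld_prior_mean(data, theta0=0.0, step_size=1e-3, batch_size=20,
                    n_steps=500, seed=0):
    """Run SGLD on a toy model x_i ~ N(theta, 1) with a flat prior on theta.

    Hypothetical helper: returns the final SGLD iterate, which could then
    serve as the mean of a Gaussian PAC-Bayes prior.
    """
    rng = np.random.default_rng(seed)
    n_total = len(data)
    theta = float(theta0)
    for _ in range(n_steps):
        # Minibatch sampled without replacement from the training data.
        batch = rng.choice(data, size=batch_size, replace=False)
        # Unbiased estimate of the log-posterior gradient: for a unit-variance
        # Gaussian likelihood, d/dtheta log p(x | theta) = x - theta.
        grad = (n_total / batch_size) * np.sum(batch - theta)
        # Injected Gaussian noise with variance equal to the step size.
        noise = rng.normal(0.0, np.sqrt(step_size))
        theta = theta + 0.5 * step_size * grad + noise
    return theta


# Toy data (assumed for illustration): 200 draws centred at 2.0.
data = np.random.default_rng(1).normal(2.0, 1.0, size=200)
prior_mean = sgld_prior_mean(data)
```

On this toy problem the iterate settles near the sample mean of `data`, with residual spread from the injected noise and the minibatch gradient noise. A constant step size is used here for simplicity; Welling and Teh's original scheme uses a decreasing step-size schedule.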

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Bounded-Rationality, Hedging, and Generalization
cs.LG 2026-05 unverdicted novelty 7.0

Generalization is a testable hedging property of the learner's response law, recovered via f-divergence regularizers that induce information-geometric curves between training loss and sample dependence.

Margin-Adaptive Confidence Ranking for Reliable LLM Judgement
cs.LG 2026-05 unverdicted novelty 5.0

Introduces a margin-adaptive confidence ranking method that learns an estimator from simulated diversity and derives margin-dependent generalization bounds for use in fixed-sequence testing of LLM-human agreement.