Tight Auditing of Differential Privacy in MST and AIM
Pith reviewed 2026-05-10 04:30 UTC · model grok-4.3
The pith
A GDP-based auditing framework delivers the first tight privacy measurements for MST and AIM in the strong-privacy regime.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors establish that a GDP-based auditing framework, which measures privacy through the complete tradeoff curve, yields tight empirical privacy parameters for MST and AIM under worst-case settings. The empirical values closely track the theoretical bounds: for (epsilon, delta) = (1, 10^{-2}), mu_emp is approximately 0.43 against an implied mu of 0.45.
What carries the argument
Gaussian Differential Privacy (GDP) auditing framework that computes the full false-positive/false-negative privacy tradeoff curve.
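A minimal sketch of the tradeoff curve this framework relies on, using the standard mu-GDP definition from Dong, Roth, and Su: at false-positive rate alpha, no attack can achieve a false-negative rate below Phi(Phi^{-1}(1 - alpha) - mu). The function names are hypothetical and are not taken from the paper's code. The second function is only a point estimate from a single observed (FPR, FNR) pair and ignores sampling error, which a real audit would have to account for with confidence intervals.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal N(0, 1)


def gdp_tradeoff(fpr: float, mu: float) -> float:
    """Lowest false-negative rate any attack can reach at this FPR under mu-GDP."""
    return _STD_NORMAL.cdf(_STD_NORMAL.inv_cdf(1.0 - fpr) - mu)


def mu_from_rates(fpr: float, fnr: float) -> float:
    """mu of the GDP curve passing through one observed (FPR, FNR) point.

    Hypothetical helper: a plain point estimate with no confidence interval.
    """
    return _STD_NORMAL.inv_cdf(1.0 - fpr) - _STD_NORMAL.inv_cdf(fnr)


if __name__ == "__main__":
    # Trace the theoretical curve for the implied mu quoted in the abstract.
    for fpr in (0.01, 0.05, 0.10, 0.50):
        fnr = gdp_tradeoff(fpr, mu=0.45)
        recovered = mu_from_rates(fpr, fnr)
        print(f"FPR={fpr:.2f}  min FNR={fnr:.4f}  recovered mu={recovered:.3f}")
```

Both rates must lie strictly between 0 and 1, since inv_cdf is undefined at the endpoints.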
Load-bearing premise
The GDP framework applied to worst-case settings of MST and AIM accurately measures their actual privacy guarantees.
What would settle it
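Repeated empirical audits whose mu_emp differs substantially from the implied theoretical mu, such as 0.60 instead of 0.45 for (epsilon, delta) = (1, 10^{-2}), would undermine the claim. A much smaller value would indicate that the audit is loose, while a larger one would indicate that the mechanism or its implementation exceeds the stated guarantee.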
Original abstract
State-of-the-art Differentially Private (DP) synthetic data generators such as MST and AIM are widely used, yet tightly auditing their privacy guarantees remains challenging. We introduce a Gaussian Differential Privacy (GDP)-based auditing framework that measures privacy via the full false-positive/false-negative tradeoff. Applied to MST and AIM under worst-case settings, our method provides the first tight audits in the strong-privacy regime. For $(\epsilon,\delta)=(1,10^{-2})$, we obtain $\mu_{emp}\approx0.43$ vs. implied $\mu=0.45$, showing a small theory-practice gap. Our code is publicly available: https://github.com/sassoftware/dpmm.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces a Gaussian Differential Privacy (GDP)-based auditing framework for measuring the privacy guarantees of MST and AIM, two state-of-the-art differentially private synthetic data generators. The framework evaluates privacy via the full false-positive/false-negative tradeoff curve and is applied under worst-case settings to deliver tight audits in the strong-privacy regime. For the parameter pair (ε,δ)=(1,10^{-2}), it reports an empirical μ_emp≈0.43 against an implied theoretical μ=0.45, indicating a small theory-practice gap. Publicly available code is provided at https://github.com/sassoftware/dpmm.
Significance. If the auditing framework and empirical results hold, the work is significant for the DP community because it supplies the first tight, GDP-based audits of widely deployed synthetic data mechanisms in the strong-privacy regime, where prior methods were loose or inapplicable. The public release of the implementation code is a clear strength that supports reproducibility and enables independent verification or extension.
minor comments (3)
- The abstract states the central empirical result (μ_emp≈0.43 vs. implied μ=0.45) but does not indicate the number of trials, the exact worst-case data distribution used, or the precise GDP conversion formulas; adding these details to §3 or §4 would improve immediate clarity without altering the claims.
- Figure captions and axis labels should explicitly state whether the plotted curves are for the empirical audit or the theoretical GDP bound, and whether they are averaged over multiple runs, to avoid reader ambiguity.
- The manuscript would benefit from a short paragraph in the introduction or §2 contrasting the new GDP auditing approach with prior black-box or membership-inference audits of MST/AIM, including quantitative comparisons of tightness where available.
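The first comment asks for the conversion formulas. One plausible conversion is sketched below, under assumptions that this page does not confirm: that (ε,δ) is first mapped to a ρ-zCDP budget with the conversion bound of Canonne, Kamath, and Steinke, and that the mechanism is then treated as a single Gaussian mechanism, for which μ = √(2ρ). All function names are hypothetical, and the grid search over α is a simple stand-in for a proper one-dimensional optimiser.

```python
import math


def cdp_delta(rho: float, eps: float) -> float:
    """Delta implied by rho-zCDP at a given eps (Canonne-Kamath-Steinke bound).

    The bound is minimised over a log-spaced grid of alpha > 1.
    """
    best_log_delta = 0.0  # delta never needs to exceed 1
    for i in range(4000):
        alpha = 1.0 + 10.0 ** (-3.0 + 6.0 * i / 3999)  # alpha - 1 in [1e-3, 1e3]
        log_delta = ((alpha - 1.0) * (alpha * rho - eps)
                     + alpha * math.log1p(-1.0 / alpha)
                     - math.log(alpha - 1.0))
        best_log_delta = min(best_log_delta, log_delta)
    return math.exp(best_log_delta)


def rho_for_budget(eps: float, delta: float) -> float:
    """Largest rho whose implied delta stays within the budget (bisection)."""
    lo, hi = 0.0, eps + 1.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if cdp_delta(mid, eps) <= delta:
            lo = mid
        else:
            hi = mid
    return lo


def implied_mu(eps: float, delta: float) -> float:
    """Assumption: a Gaussian mechanism with rho-zCDP is mu-GDP with mu = sqrt(2 * rho)."""
    return math.sqrt(2.0 * rho_for_budget(eps, delta))


if __name__ == "__main__":
    print(f"implied mu for (1, 1e-2): {implied_mu(1.0, 1e-2):.3f}")
```

Under these assumptions the result for (ε,δ)=(1,10^{-2}) should come out close to the implied μ of 0.45 quoted in the summary. If the manuscript uses a different conversion, the figure would differ accordingly.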
Simulated Author's Rebuttal
We thank the referee for their positive evaluation of our work and for recommending minor revision. We appreciate the recognition that our GDP-based auditing framework provides the first tight audits of MST and AIM in the strong-privacy regime, along with the value placed on the public code release.
Circularity Check
No significant circularity detected
full rationale
The paper introduces a GDP-based auditing framework that computes empirical privacy (μ_emp) via full ROC tradeoff on MST and AIM under worst-case settings, then compares it to the theoretically implied μ value. This comparison is external: the empirical measurement is obtained by applying the framework to the mechanisms' outputs, not by fitting parameters to the target μ or by re-deriving the same quantity from itself. No self-definitional loop, fitted-input-as-prediction, or load-bearing self-citation chain appears in the abstract or described method. The reported small gap (0.43 vs 0.45) constitutes an independent check rather than a tautology, and the derivation remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: The Gaussian Differential Privacy model accurately reflects the privacy-utility tradeoff for auditing purposes in MST and AIM.