Recognition: 1 theorem link
· Lean TheoremMulti-user Pufferfish Privacy
Pith reviewed 2026-05-16 20:46 UTC · model grok-4.3
The pith
Sufficient conditions from Wasserstein metrics enable Laplace noise calibration for Pufferfish privacy in multi-user aggregated queries.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Sufficient conditions are derived for attaining statistical indistinguishability on four sets of secret pairs in all scenarios of user data changes using the Kantorovich method with the Wasserstein metric of order 1. These can be applied to user addition or removal in tabular data, revealing that individual indifference is conditioned on the user's statistics only. For binary Bernoulli distributed random variables, the conditions can be relaxed to reduce noise and improve utility.
What carries the argument
Laplace noise calibration via the Kantorovich method and Wasserstein metric of order 1 to enforce Pufferfish privacy on secret pairs.
If this is right
- The noise calibration works for scenarios where users leave and are replaced.
- Privacy protection applies when adding or removing users from tabular data.
- Indistinguishability depends solely on the statistics of the user in question.
- Relaxed conditions for binary data reduce required noise levels.
Where Pith is reading between the lines
- This calibration technique might apply to other privacy definitions using similar distance metrics.
- In practice, it could lead to more efficient privacy mechanisms in dynamic user databases.
- Further relaxation might be possible for other discrete distributions beyond binary.
Load-bearing premise
The random variables reported by users have distributions that allow the Wasserstein metric to directly bound the privacy loss for the considered secret pairs.
What would settle it
If computing the Wasserstein distance between the distributions of secret pairs does not yield the expected bound on the privacy parameter when Laplace noise is added, the sufficient conditions would not hold.
Figures
read the original abstract
This paper studies how to achieve individual indistinguishability by pufferfish privacy in aggregated query to a multi-user system. It is assumed that each user reports realization of a random variable. We study how to calibrate Laplace noise, added to the query answer, to attain pufferfish privacy when user changes his/her reported data value, leaves the system and is replaced by another use with different randomness. Sufficient conditions are derived for all scenarios for attaining statistical indistinguishability on four sets of secret pairs. They are derived using the existing Kantorovich method (Wasserstain metric of order $1$). These results can be applied to attain indistinguishability when a certain class of users is added or removed from a tabular data. It is revealed that attaining indifference in individual's data is conditioned on the statistics of this user only. For binary (Bernoulli distributed) random variables, the derived sufficient conditions can be further relaxed to reduce the noise and improve data utility.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper studies achieving individual indistinguishability via Pufferfish privacy for aggregated queries in multi-user systems. Assuming each user reports a realization of a random variable, it calibrates Laplace noise to attain privacy when a user changes data, leaves, or is replaced. Sufficient conditions are derived using the Kantorovich method (Wasserstein metric of order 1) for four sets of secret pairs. These apply to adding/removing users from tabular data, with relaxation for Bernoulli variables to reduce noise.
Significance. If the derived conditions hold and the independence assumptions are verified, this provides a practical method for calibrating noise in dynamic multi-user aggregates to achieve Pufferfish privacy based solely on individual user statistics. It extends existing Wasserstein-based approaches to user replacement scenarios and offers utility improvements for binary data, which is relevant for applications in collaborative sensing or data sharing where user sets change.
major comments (2)
- Replacement scenario: The claim that statistical indifference depends only on the statistics of the replaced user requires that reports are independent and the aggregate query is linear (e.g., sum). The manuscript does not explicitly verify that the Kantorovich transport plan remains valid under the multi-user joint distribution, which could make the Wasserstein-1 distance depend on the full joint law rather than the marginal alone.
- Sufficient conditions section: The abstract states that conditions are derived for all scenarios using the existing Kantorovich method, but no equation or derivation is shown to confirm that the resulting bounds support the indistinguishability claims without gaps when reports may be correlated.
minor comments (1)
- The abstract contains a typo: 'Wasserstain' should be 'Wasserstein'.
Simulated Author's Rebuttal
We thank the referee for the careful review and constructive feedback. We address each major comment below and indicate the revisions planned for the manuscript.
read point-by-point responses
-
Referee: Replacement scenario: The claim that statistical indifference depends only on the statistics of the replaced user requires that reports are independent and the aggregate query is linear (e.g., sum). The manuscript does not explicitly verify that the Kantorovich transport plan remains valid under the multi-user joint distribution, which could make the Wasserstein-1 distance depend on the full joint law rather than the marginal alone.
Authors: We agree that the replacement scenario analysis relies on the assumptions of independent user reports and a linear aggregate query (e.g., summation). Under these conditions, the joint distribution factors such that the Wasserstein-1 distance between the aggregates depends only on the marginal distribution of the replaced user's report, allowing the Kantorovich transport plan to be constructed from individual statistics alone. We will add an explicit statement of these assumptions together with a brief verification of the transport plan reduction in the revised manuscript. revision: yes
-
Referee: Sufficient conditions section: The abstract states that conditions are derived for all scenarios using the existing Kantorovich method, but no equation or derivation is shown to confirm that the resulting bounds support the indistinguishability claims without gaps when reports may be correlated.
Authors: The sufficient conditions and Kantorovich derivations appear in Section 3 for the four secret-pair sets across all scenarios. These derivations implicitly rely on report independence, and we did not explicitly address correlated reports or include equations confirming the bounds in that case. We will revise the section to insert the intermediate equations showing the reduction to marginals under independence and to state clearly that the results assume independence (with a note that correlations fall outside the current scope). revision: yes
Circularity Check
No circularity: applies external Kantorovich/Wasserstein-1 method to derive conditions for new multi-user scenarios
full rationale
The paper derives sufficient conditions for pufferfish privacy under user addition/removal/replacement by directly invoking the existing Kantorovich duality for the Wasserstein-1 metric on the aggregate query output. This is an application of a standard external tool to four specific secret-pair sets; the resulting bounds are not obtained by fitting parameters to the target indistinguishability statement itself, nor by renaming a known result, nor by load-bearing self-citation. The derivation chain therefore remains self-contained against external benchmarks and receives the default non-circularity finding.
Axiom & Free-Parameter Ledger
free parameters (1)
- Laplace noise scale
axioms (2)
- domain assumption Each user reports a realization of a random variable
- domain assumption Privacy is defined over four specific sets of secret pairs
Forward citations
Cited by 2 Pith papers
-
$\alpha$-Wasserstein Mechanism for R\'{e}nyi Pufferfish Privacy
The α-Wasserstein mechanism calibrates noise for exact Rényi Pufferfish Privacy by bounding the Wasserstein metric, generalizing the W_∞ pufferfish mechanism and Rényi differential privacy results.
-
R\'enyi Pufferfish Privacy with Gaussian-based Priors: From Single Gaussian to Mixture Model
Gaussian mechanisms for Rényi Pufferfish Privacy under Gaussian and mixture priors deliver exact divergence derivations, closed-form sufficient conditions, and 48.9% less noise than additive baselines on statistical a...
Reference graph
Works this paper leans on
-
[1]
Alvim, Kostas Chatzikokolakis, Catuscia Palamidessi, and Geoffrey Smith
M’rio S. Alvim, Kostas Chatzikokolakis, Catuscia Palamidessi, and Geoffrey Smith. 2012. Measuring Information Leakage Using Generalized Gain Functions. In2012 IEEE 25th Computer Security Foundations Symposium. IEEE, 265–279. https://doi.org/10.1109/csf.2012.26 12
-
[2]
Arthur Asuncion and David Newman. 2007. UCI machine learning repository https://archive.ics.uci.edu/ml/index.php. https://archive.ics.uci.edu/ml/index. php
work page 2007
-
[3]
R. P. Brent. 1971. An algorithm with guaranteed convergence for finding a zero of a function.Comput. J.14, 4 (April 1971), 422–425. https://doi.org/10.1093/ comjnl/14.4.422
work page 1971
-
[4]
Thierry Champion, Luigi De Pascale, and Petri Juutinen. 2008. The ∞- Wasserstein Distance: Local Solutions and Existence of Optimal Transport Maps. SIAM Journal on Mathematical Analysis40, 1 (2008), 1–20
work page 2008
-
[5]
Michael R. Clarkson, Andrew C. Myers, and Fred B. Schneider. 2009. Quantifying information flow with beliefs.Journal of Computer Security17, 5 (Oct. 2009), 655–701. https://doi.org/10.3233/jcs-2009-0353
-
[6]
Henry Corrigan-Gibbs and Dan Boneh. 2017. Prio: Private, Robust, and Scalable Computation of Aggregate Statistics. In14th USENIX Symposium on Networked Systems Design and Implementation (NSDI 17). USENIX Association, Boston, MA, 259–282. https://www.usenix.org/conference/nsdi17/technical-sessions/ presentation/corrigan-gibbs
work page 2017
-
[7]
Paul Cuff and Lanqing Yu. 2016. Differential Privacy as a Mutual Information Constraint. InProceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security (CCS’16). ACM, 43–54. https://doi.org/10.1145/2976749. 2978308
-
[8]
Luigi De Pascale and Jean Louet. 2019. A study of the dual problem of the one-dimensional 𝐿∞-optimal transport problem with applications.Journal of Functional Analysis276, 11 (2019), 3304–3324
work page 2019
-
[9]
Ni Ding. 2022. Kantorovich Mechanism for Pufferfish Privacy. InProceed- ings of The 25th International Conference on Artificial Intelligence and Statis- tics (Proceedings of Machine Learning Research), Gustau Camps-Valls, Fran- cisco J. R. Ruiz, and Isabel Valera (Eds.), Vol. 151. PMLR, 5084–5103. https: //proceedings.mlr.press/v151/ding22b.html
work page 2022
-
[10]
Ni Ding. 2024. Approximation of Pufferfish Privacy for Gaussian Priors.IEEE Transactions on Information Forensics and Security19 (2024), 5630–5640. https: //doi.org/10.1109/tifs.2024.3402104
-
[11]
Ni Ding, Yucheng Liu, and Farhad Farokhi. 2021. A Linear Reduction Method for Local Differential Privacy and Log-lift. In2021 IEEE International Symposium on Information Theory (ISIT). Melbourne, 551–556
work page 2021
-
[12]
Jinshuo Dong, Aaron Roth, and Weijie J. Su. 2022. Gaussian Differential Privacy. Journal of the Royal Statistical Society Series B: Statistical Methodology84, 1 (Feb. 2022), 3–37. https://doi.org/10.1111/rssb.12454
-
[13]
Flávio du Pin Calmon and Nadia Fawaz. 2012. Privacy against statistical infer- ence. In2012 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton). 1401–1408. https://doi.org/10.1109/Allerton.2012.6483382
-
[14]
John C. Duchi, Michael I. Jordan, and Martin J. Wainwright. 2013. Local Pri- vacy and Statistical Minimax Rates. In2013 IEEE 54th Annual Symposium on Foundations of Computer Science. 429–438. https://doi.org/10.1109/FOCS.2013.53
-
[15]
Cynthia Dwork. 2006. Differential Privacy. InAutomata, Languages and Pro- gramming, Michele Bugliesi, Bart Preneel, Vladimiro Sassone, and Ingo Wegener (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 1–12
work page 2006
-
[16]
Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. 2006. Cali- brating Noise to Sensitivity in Private Data Analysis. InTheory of Cryptography, Shai Halevi and Tal Rabin (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 265–284
work page 2006
-
[17]
Cynthia Dwork, Aaron Roth, et al. 2014. The algorithmic foundations of differ- ential privacy.Found. Trends Theor. Comput. Sci.9, 3-4 (2014), 211–407
work page 2014
-
[18]
I. Issa, A. B. Wagner, and S. Kamath. 2020. An Operational Approach to Infor- mation Leakage.IEEE Transactions on Information Theory66, 3 (March 2020), 1625–1657
work page 2020
-
[19]
Daniel Kifer and Ashwin Machanavajjhala. 2011. No free lunch in data privacy. InProceedings of the 2011 ACM SIGMOD International Conference on Management of data (SIGMOD/PODS ’11). ACM, 193–204. https://doi.org/10.1145/1989323. 1989345
-
[20]
Daniel Kifer and Ashwin Machanavajjhala. 2012. A Rigorous and Customizable Framework for Privacy. InProceedings of the 31st ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems(Scottsdale, Arizona, USA)(PODS ’12). Association for Computing Machinery, New York, NY, USA, 77–88
work page 2012
-
[21]
Daniel Kifer and Ashwin Machanavajjhala. 2014. Pufferfish: A Framework for Mathematical Privacy Definitions.ACM Transactions on Database Systems39, 1, Article 3 (Jan. 2014), 36 pages. https://doi.org/10.1145/2514689
-
[22]
Jakub Konečn`y, H Brendan McMahan, Daniel Ramage, and Peter Richtárik. 2016. Federated optimization: Distributed machine learning for on-device intelligence. arXiv preprint arXiv:1610.02527(2016)
work page internal anchor Pith review Pith/arXiv arXiv 2016
-
[23]
Jakub Konečn `y, H Brendan McMahan, Felix X Yu, Peter Richtárik, Ananda Theertha Suresh, and Dave Bacon. 2016. Federated learning: Strategies for improving communication efficiency.arXiv preprint arXiv:1610.05492(2016)
work page internal anchor Pith review Pith/arXiv arXiv 2016
-
[24]
Gerty J. L. M. Lensvelt-Mulders, Joop J. Hox, Peter G. M. van der Heijden, and Cora J. M. Maas. 2005. Meta-Analysis of Randomized Response Research: Thirty- Five Years of Validation.Sociological Methods Research33, 3 (Feb. 2005), 319–348. https://doi.org/10.1177/0049124104268664
-
[25]
Jiachun Liao, Oliver Kosut, Lalitha Sankar, and Flavio du Pin Calmon. 2019. Tunable Measures for Information Leakage and Applications to Privacy-Utility Tradeoffs.IEEE Transactions on Information Theory65, 12 (Dec. 2019), 8043–8066. https://doi.org/10.1109/tit.2019.2935768
-
[26]
Ali Makhdoumi, Salman Salamatian, Nadia Fawaz, and Muriel Médard. 2014. From the Information Bottleneck to the Privacy Funnel. In2014 IEEE Information Theory Workshop (ITW 2014). 501–505. https://doi.org/10.1109/ITW.2014.6970882
-
[27]
I. Mironov. 2017. Rényi Differential Privacy. In2017 IEEE 30th Computer Security Foundations Symposium (CSF). 263–275
work page 2017
-
[28]
Theshani Nuradha and Ziv Goldfeld. 2023. Pufferfish Privacy: An Information- Theoretic Study.IEEE Transactions on Information Theory69, 11 (Nov. 2023), 7336–7356. https://doi.org/10.1109/tit.2023.3296288
-
[29]
Clément Pierquin, Aurélien Bellet, Marc Tommasi, and Matthieu Boussard
-
[30]
InInterna- tional Conference on Machine Learning (ICML 2024)
Rényi Pufferfish Privacy: General Additive Noise Mechanisms and Privacy Amplification by Iteration via Shift Reduction Lemmas. InInterna- tional Conference on Machine Learning (ICML 2024). Vienna (Austria), Austria. https://inria.hal.science/hal-04363020
work page 2024
-
[31]
Filippo Santambrogio. 2015. Optimal transport for applied mathematicians. Birkäuser, NY55, 58-63 (2015), 94
work page 2015
-
[32]
2009.On the Foundations of Quantitative Information Flow
Geoffrey Smith. 2009.On the Foundations of Quantitative Information Flow. Springer Berlin Heidelberg, 288–302. https://doi.org/10.1007/978-3-642-00596- 1_21
-
[33]
Shuang Song, Yizhen Wang, and Kamalika Chaudhuri. 2017. Pufferfish Privacy Mechanisms for Correlated Data. InProceedings of the 2017 ACM International Conference on Management of Data(Chicago, Illinois, USA). New York, NY, USA, 1291–1306
work page 2017
-
[34]
2003.An introduction to numerical analysis
Endre Süli and David F Mayers. 2003.An introduction to numerical analysis. Cambridge university press
work page 2003
-
[35]
2009.Optimal transport: old and new
Cédric Villani. 2009.Optimal transport: old and new. Vol. 338. Springer
work page 2009
-
[36]
Larry Wasserman and Shuheng Zhou. 2010. A Statistical Framework for Differential Privacy.J. Amer. Statist. Assoc.105, 489 (March 2010), 375–389. https://doi.org/10.1198/jasa.2009.tm08651
- [37]
- [38]
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.