Beyond Epsilon: A Principled QIF Framework for Local Differential Privacy
Pith reviewed 2026-06-29 17:37 UTC · model grok-4.3
The pith
LDP mechanisms can be compared by whether one is less informative than another to every possible adversary via channel refinement.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Modeling each LDP frequency estimation protocol as a probabilistic channel and applying the refinement (Blackwell) order from quantitative information flow determines when one protocol is intrinsically superior because it produces less information leakage for every possible adversary and prior; this ordering reveals that some protocols previously considered optimal are in fact dominated by others or incomparable to them.
What carries the argument
Refinement (Blackwell ordering) on probabilistic channels that represent LDP mechanisms, which decides strict superiority by comparing leakage to all adversaries.
If this is right
- Some protocols with identical epsilon values are strictly superior to others under the refinement order for all adversaries.
- Utility-driven rankings that rely only on epsilon can conflict with adversary-aware superiority.
- The seven examined protocols receive a complete classification into comparable and incomparable pairs under refinement.
- The framework supplies a formal basis for choosing among LDP mechanisms that accounts for the full attacker model rather than only worst-case distinguishability.
Where Pith is reading between the lines
- The same channel-refinement technique could be applied to compare LDP protocols on tasks other than frequency estimation.
- Protocol designers could target mechanisms that achieve the lowest possible position in the refinement order rather than only meeting an epsilon bound.
- The approach connects LDP analysis directly to existing formal-methods tools for information-flow comparison.
Load-bearing premise
Modeling LDP mechanisms as probabilistic channels and using Blackwell ordering captures the relevant distinctions for which protocol is superior against all possible adversaries in frequency estimation.
What would settle it
An explicit adversary and input distribution for which a protocol declared dominated under refinement actually produces lower leakage than the protocol declared to dominate it.
Figures
read the original abstract
Local Differential Privacy (LDP) has become the de facto standard for privacy-preserving data collection in large-scale systems, in particular for the purpose of estimating frequencies. However, the current research landscape lacks a systematic and principled way to compare LDP protocols. The parameter $\varepsilon$ of LDP is considered the measure of privacy, but it only bounds worst-case distinguishability. Other comparisons rely on utility-driven analyses, where mechanisms are ranked based on their ability to preserve data utility for a given privacy budget $\varepsilon$. Both such kinds of comparisons fail to account for the strength of protocols against diverse attacker models. In this paper, we propose a framework for analyzing LDP frequency estimation protocols through the lens of Quantitative Information Flow (QIF). By modeling LDP mechanisms as probabilistic channels, we leverage the concept of refinement (Blackwell ordering) to establish more principled classifications. This approach allows us to determine when one protocol is intrinsically superior to another for all possible adversaries, and to discuss the implications for utility. In particular, our analysis uncovers cases where protocols previously deemed "optimal" are, in fact, incomparable with, or strictly dominated by, other protocols. We provide a formal QIF-based treatment of seven state-of-the-art protocols, including Generalized Randomized Response (GRR), local hashing variants (BLH, OLH), unary encoding schemes (SUE, OUE), and Thresholding with Histogram Encoding (THE). This perspective bridges the gap between the LDP and formal methods communities and enables principled, adversary-aware reasoning about locally private systems.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces a QIF-based framework for comparing LDP frequency-estimation protocols by modeling each as a probabilistic channel from private value to reported value and applying Blackwell refinement (ordering) to determine when one protocol is intrinsically superior to another against all possible adversaries. It applies this to seven protocols (GRR, BLH, OLH, SUE, OUE, THE) and claims to identify cases where previously optimal protocols are incomparable or dominated.
Significance. If the central claim holds after lifting the ordering to the multi-user product channel and the specific loss functions used in frequency estimation, the framework would supply a principled, adversary-independent classification that goes beyond ε-bounds or utility rankings and could correct misclassifications of optimality in the LDP literature.
major comments (2)
- [Framework and main results (abstract and § on Blackwell ordering)] The central claim (abstract) asserts that Blackwell refinement on single-user channels implies strict dominance for every adversary relevant to frequency estimation. Frequency estimation, however, requires the analyst to recover the frequency vector from the multiset of n independent reports (or the resulting histogram). The paper must verify that the refinement order is preserved under independent composition when the adversary's utility is defined on the aggregate statistic (e.g., ℓ_{1} or ℓ_{2} error on the recovered frequencies) rather than on any single report. No indication is given that this lift is performed or that the class of adversaries is restricted to those whose utility depends only on single reports.
- [Treatment of GRR, BLH, OLH, SUE, OUE, THE] § on the seven protocols: the dominance or incomparability results are stated for the single-channel ordering. The manuscript must exhibit the concrete channel matrices (or their equivalence classes) and the explicit post-processing functions that witness the claimed refinements, so that the reader can check whether those witnesses survive the product construction.
minor comments (2)
- Notation for the lifted adversary utility on the histogram should be introduced explicitly when the single-channel ordering is first defined.
- The abstract states that the approach 'bridges the gap between the LDP and formal methods communities'; a short paragraph contrasting the QIF refinement with existing LDP composition theorems would make this bridge concrete.
Simulated Author's Rebuttal
We thank the referee for the detailed and insightful comments. The points raised highlight important aspects of applying the framework to frequency estimation. We address each major comment below and will revise the manuscript to incorporate clarifications and additional details.
read point-by-point responses
-
Referee: [Framework and main results (abstract and § on Blackwell ordering)] The central claim (abstract) asserts that Blackwell refinement on single-user channels implies strict dominance for every adversary relevant to frequency estimation. Frequency estimation, however, requires the analyst to recover the frequency vector from the multiset of n independent reports (or the resulting histogram). The paper must verify that the refinement order is preserved under independent composition when the adversary's utility is defined on the aggregate statistic (e.g., ℓ_{1} or ℓ_{2} error on the recovered frequencies) rather than on any single report. No indication is given that this lift is performed or that the class of adversaries is restricted to those whose utility depends only on single reports.
Authors: Blackwell refinement is preserved under independent composition. If channel C1 refines C2 via a post-processor P (C2 = P ∘ C1), then the n-fold product satisfies C2^n = P^n ∘ C1^n, with P applied independently to each report. Thus C1^n refines C2^n for any utility function on the collection of reports, including those depending on aggregate statistics such as ℓ1 or ℓ2 error on frequency vectors. This follows directly from the definition of refinement, which holds for all possible utilities on the output space. We will add an explicit statement and short proof sketch of this preservation property to the section on Blackwell ordering. revision: yes
-
Referee: [Treatment of GRR, BLH, OLH, SUE, OUE, THE] § on the seven protocols: the dominance or incomparability results are stated for the single-channel ordering. The manuscript must exhibit the concrete channel matrices (or their equivalence classes) and the explicit post-processing functions that witness the claimed refinements, so that the reader can check whether those witnesses survive the product construction.
Authors: We agree that explicit matrices and witnessing post-processors will aid verification. The orderings in the manuscript are derived from protocol definitions, but the concrete stochastic matrices and post-processing functions are not listed in full. We will add an appendix containing the channel matrices for each of the seven protocols together with the explicit post-processing maps establishing the claimed refinements (or counterexample utilities for incomparabilities). Because these post-processors act componentwise, they extend immediately to the product channels. revision: yes
Circularity Check
No circularity: framework applies external QIF/Blackwell concepts to LDP protocols without self-referential reduction
full rationale
The paper models LDP mechanisms as channels and applies Blackwell ordering (refinement) from QIF to compare protocols for frequency estimation. No equations or claims reduce a result to a fitted parameter, self-definition, or load-bearing self-citation chain. The seven-protocol analysis and claims of incomparability/dominance rest on the external ordering applied to the modeled channels, not on any internal construction that forces the outcome by definition. This is self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption LDP mechanisms can be modeled as probabilistic channels
- domain assumption Blackwell ordering determines intrinsic superiority for all adversaries
Reference graph
Works this paper leans on
-
[1]
What can we learn privately?,
S. P. Kasiviswanathan, H. K. Lee, K. Nissim, S. Raskhodnikova, and A. Smith, “What can we learn privately?,”SIAM Journal on Computing, vol. 40, no. 3, pp. 793–826, 2011
2011
-
[2]
Calibrating noise to sensitivity in private data analysis,
C. Dwork, F. McSherry, K. Nissim, and A. Smith, “Calibrating noise to sensitivity in private data analysis,” inTheory of Cryptography, pp. 265– 284, Springer Berlin Heidelberg, 2006
2006
-
[3]
RAPPOR: Randomized aggregatable privacy-preserving ordinal response,
U. Erlingsson, V . Pihur, and A. Korolova, “RAPPOR: Randomized aggregatable privacy-preserving ordinal response,” inProceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, (New York, NY , USA), pp. 1054–1067, ACM, 2014
2014
-
[4]
Learning with privacy at scale,
Apple Differential Privacy Team, “Learning with privacy at scale,”
-
[5]
https://docs-assets.developer.apple.com/ml-research/papers/ learning-with-privacy-at-scale.pdf, (accessed January 2023)
2023
-
[6]
Private federated discovery of out-of-vocabulary words for gboard,
Z. Sun, P. Kairouz, H. Sun, A. Gascon, and A. T. Suresh, “Private federated discovery of out-of-vocabulary words for gboard,”arXiv preprint arXiv:2404.11607, 2024
-
[7]
Collecting telemetry data privately,
B. Ding, J. Kulkarni, and S. Yekhanin, “Collecting telemetry data privately,” inAdvances in Neural Information Processing Systems 30 (I. Guyon, U. V . Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vish- wanathan, and R. Garnett, eds.), pp. 3571–3580, Curran Associates, Inc., 2017
2017
-
[8]
Locally differentially private protocols for frequency estimation,
T. Wang, J. Blocki, N. Li, and S. Jha, “Locally differentially private protocols for frequency estimation,” in26th USENIX Security Sympo- sium (USENIX Security 17), (Vancouver, BC), pp. 729–745, USENIX Association, Aug. 2017
2017
-
[9]
An adversarial approach to protocol analysis and selection in local differen- tial privacy,
M. Emre Gursoy, L. Liu, K.-H. Chow, S. Truex, and W. Wei, “An adversarial approach to protocol analysis and selection in local differen- tial privacy,”IEEE Transactions on Information Forensics and Security, vol. 17, pp. 1785–1799, 2022
2022
-
[10]
On the risks of collecting multidimensional data under local differential privacy,
H. H. Arcolezi, S. Gambs, J.-F. Couchot, and C. Palamidessi, “On the risks of collecting multidimensional data under local differential privacy,”Proc. VLDB Endow., vol. 16, pp. 1126–1139, jan 2023
2023
-
[11]
M. S. Alvim, K. Chatzikokolakis, A. McIver, C. Morgan, C. Palamidessi, and G. Smith,The science of quantitative information flow. Springer, 2020
2020
-
[12]
Discrete distribution esti- mation under local privacy,
P. Kairouz, K. Bonawitz, and D. Ramage, “Discrete distribution esti- mation under local privacy,” inInternational Conference on Machine Learning, pp. 2436–2444, PMLR, 2016
2016
-
[13]
Optimal schemes for discrete distribution estima- tion under locally differential privacy,
M. Ye and A. Barg, “Optimal schemes for discrete distribution estima- tion under locally differential privacy,”IEEE Transactions on Informa- tion Theory, vol. 64, no. 8, pp. 5662–5676, 2018
2018
-
[14]
Mutual Information Optimally Local Private Discrete Distribution Estimation
S. Wang, L. Huang, P. Wang, Y . Nie, H. Xu, W. Yang, X.-Y . Li, and C. Qiao, “Mutual information optimally local private discrete distribution estimation,”arXiv preprint arXiv:1607.08025, 2016
work page internal anchor Pith review Pith/arXiv arXiv 2016
-
[15]
Local, private, efficient protocols for succinct histograms,
R. Bassily and A. Smith, “Local, private, efficient protocols for succinct histograms,” inProceedings of the Forty-Seventh Annual ACM Sym- posium on Theory of Computing, STOC ’15, (New York, NY , USA), pp. 127–135, Association for Computing Machinery, 2015
2015
-
[16]
Linkedin’s audience engagements api: A privacy preserving data analytics system at scale,
R. Rogers, S. Subramaniam, S. Peng, D. Durfee, S. Lee, S. K. Kancha, S. Sahay, and P. Ahammad, “Linkedin’s audience engagements api: A privacy preserving data analytics system at scale,”Journal of Privacy and Confidentiality, vol. 11, no. 3, 2021
2021
-
[17]
Gaussian differential privacy,
J. Dong, A. Roth, and W. J. Su, “Gaussian differential privacy,”Journal of the Royal Statistical Society Series B: Statistical Methodology, vol. 84, pp. 3–37, 02 2022
2022
-
[18]
Composition theorems for f-differential privacy,
N. Fernandes, A. McIver, and P. Sadeghi, “Composition theorems for f-differential privacy,”arXiv preprint arXiv:2512.21358, 2025
-
[19]
Improving count-mean sketch as the leading locally differen- tially private frequency estimator for large dictionaries,
M. Pan, “Improving count-mean sketch as the leading locally differen- tially private frequency estimator for large dictionaries,” in2025 IEEE 38th Computer Security Foundations Symposium (CSF), pp. 97–112, 2025
2025
-
[20]
Revealing the true cost of locally differentially private protocols: An auditing perspective,
H. H. Arcolezi and S. Gambs, “Revealing the true cost of locally differentially private protocols: An auditing perspective,”Proceedings on Privacy Enhancing Technologies, vol. 2024, no. 4, pp. 123–141, 2024
2024
-
[21]
Budget inference attacks and counter- measures in locally differentially private data collection,
B. K. Balioglu and E. Gursoy, “Budget inference attacks and counter- measures in locally differentially private data collection,”ACM Trans. Internet Technol., vol. 26, Jan. 2026
2026
-
[22]
On the relation between differential privacy and quan- titative information flow,
M. S. Alvim, M. E. Andr ´es, K. Chatzikokolakis, P. Degano, and C. Palamidessi, “On the relation between differential privacy and quan- titative information flow,” inAutomata, Languages and Programming (ICALP 2011), vol. 6756 ofLecture Notes in Computer Science, pp. 60– 76, Springer, 2011
2011
-
[23]
Information-theoretic bounds for differential privacy,
G. Barthe and B. K ¨opf, “Information-theoretic bounds for differential privacy,” inProceedings of the 24th IEEE Computer Security Founda- tions Symposium (CSF), pp. 191–204, IEEE, 2011
2011
-
[24]
On the information leakage of differentially-private mechanisms,
M. S. Alvim, M. E. Andr ´es, K. Chatzikokolakis, P. Degano, and C. Palamidessi, “On the information leakage of differentially-private mechanisms,”Journal of Computer Security, vol. 23, no. 4, pp. 427–469, 2015
2015
-
[25]
Comparing systems: Max-case refinement orders and application to differential privacy,
K. Chatzikokolakis, N. Fernandes, and C. Palamidessi, “Comparing systems: Max-case refinement orders and application to differential privacy,” in32nd IEEE Computer Security Foundations Symposium, CSF 2019, Hoboken, NJ, USA, June 25-28, 2019, pp. 442–457, IEEE, 2019
2019
-
[26]
An operational approach to information leakage,
I. Issa, A. B. Wagner, and S. Kamath, “An operational approach to information leakage,”IEEE Transactions on Information Theory, vol. 66, no. 3, pp. 1625–1657, 2020
2020
-
[27]
Information- theoretic metrics for local differential privacy protocols,
M. Lopuha ¨a-Zwakenberg, B. ˇSkori´c, and N. Li, “Information- theoretic metrics for local differential privacy protocols,”arXiv, vol. abs/1910.07826, 2019
-
[28]
Analyzing the shuffle model through the lens of quantitative information flow,
M. Jurado, R. G. Gonze, M. S. Alvim, and C. Palamidessi, “Analyzing the shuffle model through the lens of quantitative information flow,” in2023 IEEE 36th Computer Security Foundations Symposium (CSF), pp. 423–438, IEEE, 2023
2023
-
[29]
Measuring information leakage using generalized gain functions,
M. S. Alvim, K. Chatzikokolakis, C. Palamidessi, and G. Smith, “Measuring information leakage using generalized gain functions,” in 2012 IEEE 25th Computer Security Foundations Symposium, pp. 265– 279, IEEE, 2012
2012
-
[30]
Quantitative notions of leakage for one-try attacks,
C. Braun, K. Chatzikokolakis, and C. Palamidessi, “Quantitative notions of leakage for one-try attacks,” inProceedings of the 25th Confer- ence on Mathematical Foundations of Programming Semantics, MFPS 2009, Oxford, UK, April 3-7, 2009(S. Abramsky, M. W. Mislove, and C. Palamidessi, eds.), vol. 249 ofElectronic Notes in Theoretical Computer Science, pp. ...
2009
-
[31]
Explainingϵin local differential privacy through the lens of quantitative information flow,
N. Fernandes, A. McIver, and P. Sadeghi, “Explainingϵin local differential privacy through the lens of quantitative information flow,” in2024 IEEE 37th Computer Security Foundations Symposium (CSF), pp. 419–432, 2024
2024
-
[32]
Universal optimality and robust utility bounds for metric differential privacy,
N. Fernandes, A. McIver, C. Palamidessi, and M. Ding, “Universal optimality and robust utility bounds for metric differential privacy,” in35th IEEE Computer Security Foundations Symposium, CSF 2022, Haifa, Israel, August 7-10, 2022, pp. 348–363, IEEE, 2022
2022
-
[33]
Kosarak dataset
“Kosarak dataset.” https://fimi.uantwerpen.be/data/. Frequent Itemset Mining Dataset Repository
-
[34]
Improving the utility of locally differentially private protocols for longitudinal and multidimensional frequency estimates,
H. H. Arcolezi, J.-F. Couchot, B. A. Bouna, and X. Xiao, “Improving the utility of locally differentially private protocols for longitudinal and multidimensional frequency estimates,”Digital Communications and Networks, vol. 10, no. 2, pp. 369–379, 2024
2024
-
[35]
Felip: A local differentially private approach to frequency estimation on multidimensional datasets,
J. S. Costa Filho and J. C. Machado, “Felip: A local differentially private approach to frequency estimation on multidimensional datasets,” inProceedings of the 26th International Conference on Extending Database Technology, EDBT 2023, Ioannina, Greece, March 28 - March 31, 2023, pp. 671–683, OpenProceedings.org, 2023
2023
-
[36]
Broadening the scope of differential privacy using metrics,
K. Chatzikokolakis, M. E. Andr ´es, N. E. Bordenabe, and C. Palamidessi, “Broadening the scope of differential privacy using metrics,” inPrivacy Enhancing Technologies - 13th International Symposium, PETS 2013, Bloomington, IN, USA, July 10-12, 2013. Proceedings(E. D. Cristofaro and M. K. Wright, eds.), vol. 7981 ofLecture Notes in Computer Science, pp. 8...
2013
-
[37]
Geo-indistinguishability: differential privacy for location-based sys- tems,
M. E. Andr ´es, N. E. Bordenabe, K. Chatzikokolakis, and C. Palamidessi, “Geo-indistinguishability: differential privacy for location-based sys- tems,” in2013 ACM SIGSAC Conference on Computer and Com- munications Security, CCS’13, Berlin, Germany, November 4-8, 2013 (A. Sadeghi, V . D. Gligor, and M. Yung, eds.), pp. 901–914, ACM, 2013
2013
-
[38]
A novel analysis of utility in privacy pipelines, using kronecker products and quantitative information flow,
M. S. Alvim, N. Fernandes, A. McIver, C. Morgan, and G. H. Nunes, “A novel analysis of utility in privacy pipelines, using kronecker products and quantitative information flow,” inProceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security, CCS 2023, Copenhagen, Denmark, November 26-30, 2023(W. Meng, C. D. Jensen, C. Cremers, and...
2023
-
[39]
The posteriors ofDunder a uniform prior are inside the convex hull of the posteriors ofC
-
[40]
Proof.It is already known that (1) implies (2) (see [10])
βC 1−αC ≤ βD 1−αD and 1−βC αC ≥ 1−βD αD . Proof.It is already known that (1) implies (2) (see [10]). We will prove that (2) implies (1), (2) implies (3) and (3) implies (2). (2) implies (1): For this, we need to prove a final requirement for refinement, that there exists an averaging that takes the posteriors ofC onto the posteriors ofD, averaged using th...
-
[41]
First encodex(channelE)
-
[42]
Definition ofE: The channel is defined asE:X →DY, where Y={(y h, ye)|y h ∈Handy e ∈[g]}
Apply GRR to the encoded value (channelG). Definition ofE: The channel is defined asE:X →DY, where Y={(y h, ye)|y h ∈Handy e ∈[g]}. The channel receives a secretx∈ X= [k]as input and outputs a pairy= (y h, ye), wherey h ={(1, w i), . . . ,(k, wk)|w k ∈[g]}is a hash function chosen uniformly at random fromHandy e is the encoded value ofx. The channel matri...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.