pith. sign in

arxiv: 2601.02855 · v2 · submitted 2026-01-06 · 💻 cs.IT · cs.CR· math.IT

Context-aware Privacy Bounds for Linear Queries

Pith reviewed 2026-05-16 17:30 UTC · model grok-4.3

classification 💻 cs.IT cs.CRmath.IT
keywords privacy boundslinear queriespointwise maximal leakagedifferential privacyLaplace mechanismcontext-awaredata releaseprivacy leakage
0
0 comments X

The pith

Incorporating prior distribution knowledge yields a tighter leakage bound than differential privacy for linear queries.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper aims to show that standard differential privacy requires more noise than necessary for protecting linear queries when some prior information about the data is available. By using pointwise maximal leakage and assuming a lower bound on the probability of records belonging to classes, they derive a context-aware bound. This bound is strictly tighter than the DP guarantee but approaches it as the probability bound goes to zero. A sympathetic reader would care because it allows releasing data with less added noise while still controlling privacy leakage, improving the utility of analyses based on linear queries.

Core claim

For the Laplace mechanism applied to general linear queries, we derive a tight context-aware privacy bound using pointwise maximal leakage that incorporates a lower bound on the prior probability of any record belonging to a specific class. This bound is strictly smaller than the standard differential privacy guarantee and converges to it as the probability lower bound approaches zero.

What carries the argument

Pointwise maximal leakage (PML) with a context-aware adjustment based on a prior probability lower bound, which quantifies the maximum leakage for each possible output.

If this is right

  • The noise scale required by the Laplace mechanism can be reduced while maintaining the same privacy level.
  • The bound applies to general linear queries, not just specific ones.
  • As the assumed probability lower bound approaches zero, the bound recovers the standard DP guarantee.
  • Exploiting prior knowledge reduces the noise needed for privacy in data releases.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Estimating or assuming such probability lower bounds in real datasets could lead to significant utility improvements in privacy-preserving data publishing.
  • This method might be adaptable to other privacy mechanisms or query types beyond linear ones.
  • Future work could explore how to robustly set the probability lower bound without leaking additional information.

Load-bearing premise

There exists a positive lower bound on the probability that any single record belongs to any specific class according to the prior distribution.

What would settle it

A simulation where the actual prior probability is below the assumed lower bound and the observed leakage exceeds the derived bound would falsify the claim.

Figures

Figures reproduced from arXiv: 2601.02855 by 2), (2) Inria Saclay), Heng Zhao (1), Sara Saeidian (1, Tobias J. Oechtering (1) ((1) KTH Royal Institute of Technology.

Figure 1
Figure 1. Figure 1: Comparison of privacy bounds for different workload type (leakage vs. prior parameter [PITH_FULL_IMAGE:figures/full_fig_p005_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Noise scale b vs. privacy parameter ε. This figure illustrates the minimum noise scale b required to ensure a target privacy leakage ε for the difference query workload (9). The solid (bPML(i)), dashed (bPML(ii)), and dash-dotted (bDP) lines denote the minimum noise scales required for Thm. 2, Cor. 1, and εDP to be at most ε, respectively. B. Numerical Evaluations Here, we empirically evaluate our bounds a… view at source ↗
read the original abstract

Linear queries, as the basis of broad analysis tasks, are often released through privacy mechanisms based on differential privacy (DP), the most popular framework for privacy protection. However, DP adopts a context-free definition that operates independently of the data-generating distribution. In this paper, we revisit the privacy analysis of the Laplace mechanism through the lens of pointwise maximal leakage (PML). We demonstrate that the distribution-agnostic definition of the DP framework often mandates excessive noise. To address this, we incorporate an assumption about the prior distribution by lower-bounding the probability of any single record belonging to any specific class. With this assumption, we derive a tight, context-aware leakage bound for general linear queries, and prove that our derived bound is strictly tighter than the standard DP guarantee and converges to the DP guarantee as this probability lower bound approaches zero. Numerical evaluations demonstrate that by exploiting this prior knowledge, the required noise scale can be reduced while maintaining privacy guarantees.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript revisits the Laplace mechanism for linear queries through pointwise maximal leakage (PML). By introducing a lower bound p_min on the probability that any single record belongs to any specific class, the authors derive a context-aware leakage bound claimed to be strictly tighter than the standard differential privacy (DP) guarantee. They prove convergence to the DP bound as p_min approaches zero and present numerical evaluations indicating that the required noise scale can be reduced while preserving the privacy guarantee.

Significance. If the central derivation holds, the work offers a principled way to tighten privacy bounds for common linear queries by exploiting limited prior information, which could improve utility without sacrificing formal guarantees. The explicit convergence result and the use of PML as a finer-grained measure than DP are positive features.

major comments (2)
  1. [Main theorem and its proof (likely §4)] The proof that the per-record p_min lower bound induces a strictly tighter PML bound for general linear queries Ax appears to proceed from marginal probabilities alone. Because linear queries aggregate multiple records, P(Ax = y) depends on the joint prior over the full dataset; without an explicit independence or correlation assumption, the claimed strict improvement over DP does not necessarily hold uniformly. This assumption must be stated and justified in the main theorem.
  2. [Convergence argument following the main bound] The convergence claim as p_min → 0 is stated, but the rate and the conditions under which the bound remains valid for finite p_min need to be made explicit; the current argument may only recover DP in the limit without guaranteeing the intermediate strict improvement for all linear queries.
minor comments (2)
  1. [Preliminaries] Clarify the precise definition of the linear query matrix A and the output space in the notation section to avoid ambiguity when the query dimension differs from the record dimension.
  2. [Numerical evaluations] The numerical evaluation section would benefit from an explicit statement of the dataset size, query matrix dimensions, and the exact p_min values used in the plots.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful and constructive review. The comments on the main theorem and convergence argument are helpful for improving clarity. We address each point below and will revise the manuscript to incorporate the requested clarifications.

read point-by-point responses
  1. Referee: The proof that the per-record p_min lower bound induces a strictly tighter PML bound for general linear queries Ax appears to proceed from marginal probabilities alone. Because linear queries aggregate multiple records, P(Ax = y) depends on the joint prior over the full dataset; without an explicit independence or correlation assumption, the claimed strict improvement over DP does not necessarily hold uniformly. This assumption must be stated and justified in the main theorem.

    Authors: We appreciate this observation. The derivation bounds P(Ax = y) by considering the worst-case joint distribution consistent with the per-record marginal lower bound p_min. To obtain a uniform strict improvement, the analysis assumes record independence under the prior (a standard modeling choice in context-aware privacy to ensure tractability when only marginal information is available). We agree this must be stated explicitly. In the revised manuscript we will add a clear statement of the independence assumption directly in the main theorem (Section 4) together with a short justification referencing common practice in the PML and context-aware DP literature. revision: yes

  2. Referee: The convergence claim as p_min → 0 is stated, but the rate and the conditions under which the bound remains valid for finite p_min need to be made explicit; the current argument may only recover DP in the limit without guaranteeing the intermediate strict improvement for all linear queries.

    Authors: Thank you for highlighting the need for greater precision. The proof establishes convergence by taking the limit of the closed-form PML expression as p_min → 0. For any fixed p_min > 0 the bound is strictly smaller than the DP guarantee because the support restriction induced by p_min reduces the worst-case leakage; this holds for every linear query Ax under the independence assumption. In the revision we will add an explicit remark on the convergence rate (linear in p_min for small p_min) and restate the conditions under which the strict improvement applies for all finite p_min ∈ (0,1] and all linear queries. revision: yes

Circularity Check

0 steps flagged

Derivation self-contained from PML properties plus explicit p_min assumption

full rationale

The paper starts from the definition of pointwise maximal leakage, introduces an explicit modeling assumption that lower-bounds the marginal probability of any record belonging to any class, and then algebraically derives a context-aware leakage expression for linear queries. The resulting bound is shown to be strictly smaller than the DP guarantee for any positive p_min and to recover the DP guarantee in the limit p_min -> 0. These steps are direct consequences of the stated assumptions and the PML definition; no parameter is fitted to data, no quantity is renamed as a prediction, and no load-bearing step reduces to a self-citation or self-definition. The skeptic concern about joint priors is a question of proof validity, not circularity.

Axiom & Free-Parameter Ledger

1 free parameters · 2 axioms · 0 invented entities

The claim rests on the definition and properties of pointwise maximal leakage together with the Laplace mechanism and the introduced probability lower-bound assumption.

free parameters (1)
  • probability lower bound
    Lower bound on the probability that any single record belongs to any specific class; used to tighten the leakage bound.
axioms (2)
  • standard math Properties of pointwise maximal leakage
    Relies on the established definition and properties of PML for the leakage analysis.
  • domain assumption Laplace mechanism noise addition
    Standard assumption that the Laplace mechanism is used for releasing linear query answers.

pith-pipeline@v0.9.0 · 5486 in / 1242 out tokens · 54718 ms · 2026-05-16T17:30:04.659508+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

24 extracted references · 24 canonical work pages

  1. [1]

    Calibrating noise to sensitivity in private data analysis,

    C. Dwork, F. McSherry, K. Nissim, and A. Smith, “Calibrating noise to sensitivity in private data analysis,” inProceedings of the Third Conference on Theory of Cryptography, ser. TCC’06, New York, NY: Springer-Verlag, 2006, pp. 265–284,ISBN: 3540327312.DOI: 10.1007/ 11681878_14

  2. [2]

    2014.The Algorithmic Foundations of Differential Privacy

    C. Dwork and A. Roth, “The algorithmic foundations of differential privacy,”Found. Trends Theor. Comput. Sci., vol. 9, no. 3–4, pp. 211– 407, Aug. 2014,ISSN: 1551-305X.DOI: 10.1561/0400000042

  3. [3]

    No free lunch in data privacy,

    D. Kifer and A. Machanavajjhala, “No free lunch in data privacy,” in Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data, ser. SIGMOD ’11, Athens, Greece: Association for Computing Machinery, 2011, pp. 193–204,ISBN: 9781450306614. DOI: 10.1145/1989323.1989345

  4. [4]

    A rigorous and customizable frame- work for privacy,

    D. Kifer and A. Machanavajjhala, “A rigorous and customizable frame- work for privacy,” inProceedings of the 31st ACM SIGMOD-SIGACT- SIGAI Symposium on Principles of Database Systems, ser. PODS ’12, Scottsdale, Arizona, USA: Association for Computing Machinery, 2012, pp. 77–88,ISBN: 9781450312486.DOI: 10 . 1145 / 2213556 . 2213571

  5. [5]

    Evaluating differential privacy on correlated datasets using pointwise maximal leakage,

    S. Saeidian, T. J. Oechtering, and M. Skoglund, “Evaluating differential privacy on correlated datasets using pointwise maximal leakage,” in Privacy Technologies and Policy, Cham: Springer Nature Switzerland, 2024, pp. 73–86.DOI: 10.1007/978-3-031-68024-3_4

  6. [6]

    Pointwise maximal leakage,

    S. Saeidian, G. Cervia, T. J. Oechtering, and M. Skoglund, “Pointwise maximal leakage,” in2022 IEEE International Symposium on Infor- mation Theory (ISIT), 2022, pp. 626–631.DOI: 10.1109/ISIT50566. 2022.9834814

  7. [7]

    Rethinking disclosure prevention with pointwise maximal leakage,

    S. Saeidian, G. Cervia, T. J. Oechtering, and M. Skoglund, “Rethinking disclosure prevention with pointwise maximal leakage,”Journal of Privacy and Confidentiality, vol. 15, no. 1, Mar. 2025.DOI: 10.29012/ jpc.893

  8. [8]

    An adaptive mechanism for accurate query answering under differential privacy,

    C. Li and G. Miklau, “An adaptive mechanism for accurate query answering under differential privacy,”Proc. VLDB Endow., vol. 5, no. 6, pp. 514–525, Feb. 2012,ISSN: 2150-8097.DOI: 10 . 14778 / 2168651.2168653

  9. [9]

    The matrix mechanism: Optimizing linear counting queries under differential privacy,

    C. Li, G. Miklau, M. Hay, A. McGregor, and V . Rastogi, “The matrix mechanism: Optimizing linear counting queries under differential privacy,”The VLDB journal, vol. 24, no. 6, pp. 757–781, 2015

  10. [10]

    A tight context-aware privacy bound for histogram publication,

    S. Saeidian, A. Yavuzyılmaz, L. Grosse, G. Schuppe, and T. J. Oechter- ing, “A tight context-aware privacy bound for histogram publication,” IEEE Signal Processing Letters, vol. 32, pp. 4169–4173, 2025.DOI: 10.1109/LSP.2025.3620776

  11. [11]

    Opti- mizing error of high-dimensional statistical queries under differential privacy,

    R. McKenna, G. Miklau, M. Hay, and A. Machanavajjhala, “Opti- mizing error of high-dimensional statistical queries under differential privacy,”Proc. VLDB Endow., vol. 11, no. 10, pp. 1206–1219, Jun. 2018,ISSN: 2150-8097.DOI: 10.14778/3231751.3231769

  12. [12]

    A data- and workload-aware algorithm for range queries under differential privacy,

    C. Li, M. Hay, G. Miklau, and Y . Wang, “A data- and workload-aware algorithm for range queries under differential privacy,”Proc. VLDB Endow., vol. 7, no. 5, pp. 341–352, Jan. 2014,ISSN: 2150-8097.DOI: 10.14778/2732269.2732271

  13. [13]

    A simple and practical algorithm for differentially private data release,

    M. Hardt, K. Ligett, and F. Mcsherry, “A simple and practical algorithm for differentially private data release,” inAdvances in Neural Information Processing Systems, F. Pereira, C. Burges, L. Bottou, and K. Weinberger, Eds., vol. 25, Curran Associates, Inc., 2012

  14. [14]

    AIM: an adaptive and iterative mechanism for differentially private synthetic data

    R. McKenna, B. Mullins, D. Sheldon, and G. Miklau, “Aim: An adaptive and iterative mechanism for differentially private synthetic data,”Proc. VLDB Endow., vol. 15, no. 11, pp. 2599–2612, Jul. 2022, ISSN: 2150-8097.DOI: 10.14778/3551793.3551817

  15. [15]

    Bluhm, ´A

    S. Saeidian, G. Cervia, T. J. Oechtering, and M. Skoglund, “Pointwise maximal leakage on general alphabets,” in2023 IEEE International Symposium on Information Theory (ISIT), 2023, pp. 388–393.DOI: 10.1109/ISIT54713.2023.10206975

  16. [16]

    M. S. Alvim, K. Chatzikokolakis, A. McIver, C. Morgan, C. Palamidessi, and G. Smith,The Science of Quantitative Information Flow. Springer Cham, 2020

  17. [17]

    An operational approach to information leakage,

    I. Issa, A. B. Wagner, and S. Kamath, “An operational approach to information leakage,”IEEE Transactions on Information Theory, vol. 66, no. 3, pp. 1625–1657, 2020.DOI: 10.1109/TIT.2019.2962804

  18. [18]

    Alvim, Kostas Chatzikokolakis, Catuscia Palamidessi, and Geoffrey Smith

    M. S. Alvim, K. Chatzikokolakis, C. Palamidessi, and G. Smith, “Measuring information leakage using generalized gain functions,” in 2012 IEEE 25th Computer Security Foundations Symposium, 2012, pp. 265–279.DOI: 10.1109/CSF.2012.26

  19. [19]

    Additive and multiplicative notions of leakage, and their capacities,

    M. S. Alvim, K. Chatzikokolakis, A. Mciver, C. Morgan, C. Palamidessi, and G. Smith, “Additive and multiplicative notions of leakage, and their capacities,” in2014 IEEE 27th Computer Security Foundations Symposium, 2014, pp. 308–322.DOI: 10.1109/CSF.2014. 29

  20. [20]

    On measures of entropy and information,

    A. Rényi, “On measures of entropy and information,” inProceedings of the fourth Berkeley symposium on mathematical statistics and probability, Berkeley, California, USA, vol. 1, 1961, pp. 547–561

  21. [21]

    URL https://doi.org/10.1109/TIT.2014.2320500

    T. van Erven and P. Harremoës, “Rényi divergence and Kullback- Leibler divergence,”IEEE Transactions on Information Theory, vol. 60, no. 7, pp. 3797–3820, 2014.DOI: 10.1109/TIT.2014.2320500

  22. [22]

    Kamalaruban, V

    P. Kamalaruban, V . Perrier, H. J. Asghar, and M. A. Kaafar,Not all attributes are created equal:d X -private mechanisms for linear queries, 2019. arXiv: 1806.02389[stat.ML]

  23. [23]

    Privacy, accuracy, and consistency too: A holistic solution to contingency table release,

    B. Barak, K. Chaudhuri, C. Dwork, S. Kale, F. McSherry, and K. Talwar, “Privacy, accuracy, and consistency too: A holistic solution to contingency table release,” inProceedings of the Twenty-Sixth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, ser. PODS ’07, Beijing, China: Association for Computing Machinery, 2007, pp. 273–282,ISBN...

  24. [24]

    Differential privacy and the risk-utility tradeoff for multi-dimensional contingency tables,

    S. E. Fienberg, A. Rinaldo, and X. Yang, “Differential privacy and the risk-utility tradeoff for multi-dimensional contingency tables,” inProceedings of the 2010 International Conference on Privacy in Statistical Databases, ser. PSD’10, Corfu, Greece: Springer-Verlag, 2010, pp. 187–199,ISBN: 3642158374. APPENDIXA AUXILIARYLEMMAS FORPROOF OFTHEOREM2 Lemma ...