pith. machine review for the scientific record.

arxiv: 2605.06502 · v1 · submitted 2026-05-07 · 💻 cs.CR

Recognition: unknown

Privacy by Postprocessing the Discrete Laplace Mechanism

Authors on Pith: no claims yet

Pith reviewed 2026-05-08 08:56 UTC · model grok-4.3

classification 💻 cs.CR
keywords: differential privacy · discrete Laplace mechanism · postprocessing · unbiased estimation · subexponential functions · entropy estimation · profile estimation

The pith

Postprocessing the discrete Laplace mechanism produces unbiased estimators of subexponential functions and replicates other privacy mechanisms.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that the discrete Laplace mechanism can be postprocessed to yield unbiased estimators of any subexponential function of the data, and can also be postprocessed to reproduce the output distribution of the Laplace or Staircase mechanisms with identical privacy parameters. This versatility argues for preferring the discrete Laplace mechanism whenever the data is discrete; the paper demonstrates efficient postprocessing for structured functions and applies it to estimation tasks. The result matters because it delivers unbiased private estimates without changing the underlying mechanism.

Core claim

The discrete Laplace mechanism, after appropriate postprocessing, yields a simple unbiased estimator for any subexponential function f of the original data. It can also be postprocessed to match the output distribution of the Laplace mechanism or the Staircase mechanism while keeping the privacy parameters the same. Bounds on the variance of this estimator are provided in comparison to the biased alternative of evaluating f directly on the noisy output.
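The summary does not reproduce the construction, but in one dimension a postprocessor of this kind can be sketched concretely: the DLap(p) pmf (1−p)/(1+p)·p^|k| has a convolution inverse that is a finite three-tap filter, so an unbiased g with E[g(x+η)] = f(x) can be read off pointwise. This is our reconstruction under that assumption, not necessarily the paper's exact estimator; `unbias` and `expectation` are illustrative names.

```python
import numpy as np

def unbias(f, p):
    """Return g with E[g(x + eta)] = f(x) for eta ~ DLap(p).

    The DLap(p) pmf has a convolution inverse supported on {-1, 0, 1},
    so g is a pointwise three-tap filter of f.
    """
    return lambda z: ((1 + p**2) * f(z) - p * (f(z - 1) + f(z + 1))) / (1 - p)**2

def expectation(g, x, p, K=200):
    """Deterministic check: truncated sum of Pr[eta = k] * g(x + k).

    Truncation error decays like p^K, so K = 200 is effectively exact.
    """
    ks = np.arange(-K, K + 1)
    pmf = (1 - p) / (1 + p) * p ** np.abs(ks)
    return float(np.sum(pmf * g(x + ks)))

p = np.exp(-1.0)          # epsilon = 1
f = lambda z: z**2        # a subexponential (polynomial) target
g = unbias(f, p)
print(expectation(g, 3, p))   # ≈ 9.0 = f(3)
```

For f(z) = z² the filter simplifies to g(z) = z² − 2p/(1−p)², i.e. subtracting the variance 2p/(1−p)² of DLap(p) — the familiar bias correction for a squared noisy count.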

What carries the argument

Postprocessing applied to the output of the discrete Laplace mechanism to unbias it or to match other distributions.
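The distribution-matching direction can be checked numerically in the Laplace case. Assuming the residual density q is the discrete-Laplace deconvolution filter applied to the continuous Laplace density (our reconstruction of the translated components in Figure 2; the paper's Theorem 10 may state q differently), the mixture over DLap atoms recreates (ε/2)·e^{−ε|z|}:

```python
import numpy as np

eps = 2.0
p = np.exp(-eps)

lap = lambda z: 0.5 * eps * np.exp(-eps * np.abs(z))  # continuous Laplace density

# Candidate residual density q: the three-tap DLap deconvolution filter
# applied to lap (our reconstruction -- it comes out supported on [-1, 1]).
q = lambda z: ((1 + p**2) * lap(z) - p * (lap(z - 1) + lap(z + 1))) / (1 - p)**2

def mixture(z, K=80):
    """Sum of translated components Pr[eta = k] * q(z - k), as in Figure 2."""
    ks = np.arange(-K, K + 1)
    pmf = (1 - p) / (1 + p) * p ** np.abs(ks)
    return float(np.sum(pmf * q(z - ks)))

for z in (0.0, 0.37, -1.6):
    print(mixture(z), lap(z))  # the two columns agree
```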

Load-bearing premise

The data is discrete, or discretizable with controlled ℓ1-sensitivity, and the target functions are subexponential.
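Why the subexponential restriction is load-bearing can be seen numerically: with noise DLap(e^{−ε}), the sum defining an unbiased estimator's expectation converges only when f grows more slowly than e^{ε|z|}. A sketch using the three-tap deconvolution filter (our reconstruction of the postprocessor, not the paper's notation):

```python
import numpy as np

def truncated_expectation(f, p, x, K):
    """Truncated E[g(x + eta)] for the three-tap deconvolution postprocessor g."""
    g = lambda z: ((1 + p**2) * f(z) - p * (f(z - 1) + f(z + 1))) / (1 - p)**2
    ks = np.arange(-K, K + 1)
    pmf = (1 - p) / (1 + p) * p ** np.abs(ks)
    return float(np.sum(pmf * g(x + ks)))

p = np.exp(-1.0)  # epsilon = 1

# Subexponential target (growth rate 0.5 < eps): the sum stabilizes at f(0) = 1.
f_ok = lambda z: np.exp(0.5 * z)
print(truncated_expectation(f_ok, p, 0, 50), truncated_expectation(f_ok, p, 0, 400))

# Growth rate 1.2 > eps: the truncated sums blow up in magnitude as K grows,
# so no unbiased estimator with a convergent expectation exists at this noise level.
f_bad = lambda z: np.exp(1.2 * z)
print(truncated_expectation(f_bad, p, 0, 50), truncated_expectation(f_bad, p, 0, 400))
```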

What would settle it

Compute the unbiased estimator for a simple subexponential function such as the sum on synthetic discrete data and check if its expectation equals the true value.
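A minimal version of that experiment. For the plain sum the noisy output is already unbiased (DLap noise has mean zero), so this sketch uses the nonlinear target f(z) = z² instead; the postprocessor is our reconstruction of the deconvolution filter, and the noise is sampled as a difference of two geometric variables, a standard way to draw DLap(p):

```python
import numpy as np

rng = np.random.default_rng(0)
p = 0.5                      # DLap parameter, eps = ln 2
x_true = 3                   # true (integer) statistic
N = 200_000

# eta = G1 - G2 with G1, G2 iid Geometric(1 - p) is distributed as DLap(p).
g1 = rng.geometric(1 - p, size=N)
g2 = rng.geometric(1 - p, size=N)
noisy = x_true + (g1 - g2)

f = lambda z: z**2
# Deconvolution postprocessor; for f(z) = z^2 it reduces to
# g(z) = z^2 - 2p/(1-p)^2, i.e. subtracting the DLap variance.
g = lambda z: ((1 + p**2) * f(z) - p * (f(z - 1) + f(z + 1))) / (1 - p)**2

print(np.mean(f(noisy)))  # biased: concentrates near f(x) + Var(eta) = 9 + 4
print(np.mean(g(noisy)))  # unbiased: concentrates near f(x) = 9
```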

Figures

Figures reproduced from arXiv: 2605.06502 by Jacob Imola, Quentin Hillebrand, Rasmus Pagh, Sia Sejer.

Figure 1. Examples where our estimator compares favorably to the naive estimator on real datasets.
Figure 2. Illustration of Theorem 10 for ε = 2 (so p = e^{-2}). Each red curve is one translated component Pr[η = k]·q_Lap(z − k) corresponding to an atom of the discrete Laplace distribution η ∼ DLap(e^{-ε}). Summing all components yields exactly the continuous Laplace density (ε/2)·e^{-ε|z|}, shown in black.
Figure 3. Illustration of Theorem 17 for ε = 1 and γ = 1/4 (so p = e^{-1}). Each red dashed curve is one translated component Pr[η = k]·q_γ(z − k) corresponding to an atom of the discrete Laplace distribution η ∼ DLap(e^{-ε}). Summing all components yields exactly the Staircase density, shown in black.
Figure 4. Root mean square error of the number of k-stars (normalized) under varying privacy budgets.
Figure 5. Mean square error of the entropy estimation on the bag-of-words counts of NIPS articles.
Figure 6. Partition function estimation on synthetic datasets.
Figure 7. Partition function estimation on real datasets.
Figure 8. Profile estimation on synthetic datasets.
Figure 9. Profile estimation on real datasets.
read the original abstract

We show that an "old dog", the classical discrete Laplace (aka.~geometric) mechanism, can "perform new tricks": 1. It can be post-processed to yield a simple, unbiased estimator of any subexponential function $f$ of the original data, giving a simple, discrete, multivariate version of the recent unbiasing result for the Laplace mechanism by Calmon et al. (FORC '25). 2. It can be post-processed to output the same distribution as the Laplace mechanism or the Staircase mechanism with identical privacy parameters. Thus, the discrete Laplace mechanism is a versatile mechanism that should be preferred over the Laplace and Staircase mechanisms whenever the data is discrete (or can be made discrete while controlling $\ell_1$-sensitivity). We show bounds on the variance of our estimator, compared to the mean square error of the biased estimator that simply evaluates the $f$ on the output of the mechanism. Though our unbiased estimator has exponential running time for worst-case functions, we show that it can often be computed in linear or polynomial time for some common functions exhibiting structure. We showcase the properties of our methods empirically with several use cases including profile and entropy estimation, as well as distributed/federated data analysis applications in which unbiasedness is key to accuracy.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The paper claims that the discrete Laplace (geometric) mechanism can be post-processed to yield an unbiased estimator for any subexponential function f of the original data (a multivariate extension of recent Laplace unbiasing results), and can also be post-processed to exactly match the output distributions of the Laplace or Staircase mechanisms under the same privacy parameters. It derives variance bounds for the unbiased estimator relative to the MSE of the naive biased estimator, shows that the postprocessor runs in linear or polynomial time for structured f (e.g., profile and entropy estimation) despite exponential worst-case time, and empirically validates the approach on profile/entropy estimation and federated data analysis tasks. The central conclusion is that the discrete Laplace mechanism should therefore be preferred over Laplace and Staircase mechanisms whenever data is discrete or can be discretized while controlling ℓ1-sensitivity.

Significance. If the postprocessing constructions, variance bounds, and efficiency results for structured functions hold, the work would offer a practical unification of mechanisms in differential privacy, enabling unbiased estimation in discrete settings where bias accumulation harms accuracy (e.g., federated statistics). The explicit efficiency results for common structured functions and empirical demonstrations strengthen applicability beyond worst-case analysis.

major comments (1)
  1. [Abstract] Abstract: The recommendation that the discrete Laplace mechanism 'should be preferred' over Laplace and Staircase 'whenever the data is discrete (or can be made discrete while controlling ℓ1-sensitivity)' is load-bearing on postprocessing being tractable in practice. The manuscript correctly notes exponential runtime for arbitrary subexponential f but only derives linear/polynomial algorithms when f has additional structure (e.g., the showcased profile/entropy cases). No general bound is provided on the runtime in terms of sensitivity, dimension, or properties of typical application functions, leaving the scope of the preference claim unclear.
minor comments (1)
  1. [Abstract] The abstract states that the unbiased estimator 'can often be computed in linear or polynomial time for some common functions exhibiting structure' without defining 'structure' formally or giving a general criterion that would allow readers to assess new functions.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for their thoughtful review and constructive feedback on the abstract's recommendation. We address the major comment below and will make the suggested revisions to clarify the scope of our claims.

read point-by-point responses
  1. Referee: [Abstract] Abstract: The recommendation that the discrete Laplace mechanism 'should be preferred' over Laplace and Staircase 'whenever the data is discrete (or can be made discrete while controlling ℓ1-sensitivity)' is load-bearing on postprocessing being tractable in practice. The manuscript correctly notes exponential runtime for arbitrary subexponential f but only derives linear/polynomial algorithms when f has additional structure (e.g., the showcased profile/entropy cases). No general bound is provided on the runtime in terms of sensitivity, dimension, or properties of typical application functions, leaving the scope of the preference claim unclear.

    Authors: We agree that the abstract's preference recommendation is broad and depends on the postprocessing being tractable. The manuscript already notes the exponential worst-case runtime for arbitrary subexponential f and derives linear/polynomial-time algorithms only for structured f (e.g., profile and entropy estimation). No general runtime bound is provided because the runtime inherently depends on the specific structure of f, which varies by application; a uniform bound over the entire subexponential class would require stronger assumptions than the paper makes. To address this, we will revise the abstract (and the related claims in the introduction and conclusion) to qualify the preference to settings where f has structure permitting efficient postprocessing, such as the common cases demonstrated in the applications section. This better aligns the recommendation with the paper's efficiency results. revision: yes

Circularity Check

0 steps flagged

No circularity: postprocessing derivations are independent extensions of cited prior work

full rationale

The paper derives explicit postprocessing procedures for unbiased estimation of subexponential functions and for matching Laplace/Staircase output distributions, building directly on standard DP definitions and the external Calmon et al. unbiasing result. No load-bearing step reduces by construction to a fitted parameter, self-referential definition, or self-citation chain; runtime bounds and structure-specific algorithms are stated separately with explicit caveats for worst-case exponential time. The preference recommendation follows from these independent constructions rather than tautological renaming or imported uniqueness.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The work rests on standard differential privacy axioms (post-processing immunity, definition of (ε,δ)-DP) and properties of the discrete Laplace distribution; no free parameters or invented entities are introduced in the abstract.

axioms (2)
  • [standard math] Standard (ε,δ)-differential privacy definition and post-processing property hold for the discrete Laplace mechanism.
    Invoked implicitly when claiming the postprocessed outputs preserve privacy parameters.
  • [domain assumption] The target functions f are subexponential.
    Required for the unbiased estimator to exist and for variance bounds.

pith-pipeline@v0.9.0 · 5534 in / 1330 out tokens · 41553 ms · 2026-05-08T08:56:07.401815+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

12 extracted references · 5 canonical work pages · 1 internal anchor

  1. Flavio Calmon, Elbert Du, Cynthia Dwork, Brian Finley, and Grigory Franguridi. Debiasing Functions of Private Statistics in Postprocessing. In Mark Bun, editor, 6th Symposium on Foundations of Responsible Computing (FORC 2025), volume 329 of Leibniz International Proceedings in Informatics (LIPIcs), pages 17:1–17:18, Dagstuhl, Germany. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. ISBN 978-3-95977-367-6. doi: 10.4230/LIPIcs.FORC.2025.17.
  2. Albert Cheu, Adam Smith, Jonathan Ullman, David Zeber, and Maxim Zhilyaev. Distributed differential privacy via shuffling. In Annual International Conference on the Theory and Applications of Cryptographic Techniques, pages 375–403. Springer.
  3. In: Halevi, S., Rabin, T. (eds.). doi: 10.1007/11681878_14.
  4. Úlfar Erlingsson, Vitaly Feldman, Ilya Mironov, Ananth Raghunathan, Kunal Talwar, and Abhradeep Thakurta. Amplification by shuffling: From local to central differential privacy via anonymity. In Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 2468–2479. SIAM.
  5. Arpita Ghosh, Tim Roughgarden, and Mukund Sundararajan. Universally utility-maximizing privacy mechanisms. In STOC 2009, pages 351–360.
  6. Theory of Point Estimation. ISBN 0387985026. doi: 10.1007/b98854.
  7. Jure Leskovec and Julian McAuley. Learning to discover social circles in ego networks. In F. Pereira, C.J. Burges, L. Bottou, and K. Weinberger, editors, Advances in Neural Information Processing Systems.
  8. doi: 10.24432/C5ZG6P.
  9. Günter F. Steinke and Thomas Steinke. Privately estimating black-box statistics. CoRR, abs/2510.00322. doi: 10.48550/ARXIV.2510.00322.
  10. Hao Wu and Rasmus Pagh. Profile reconstruction from private sketches. In Forty-first International Conference on Machine Learning (ICML).
  11. Imola, Murakami, and Chaudhuri (2021).
  12. Ghazi, Kamath, Kumar, and Manurangsi (2022).