Interpreting the Error of Differentially Private Median Queries through Randomization Intervals
Pith reviewed 2026-05-10 16:54 UTC · model grok-4.3
The pith
PostRI computes the randomization interval after the differentially private median has been released, so the median itself keeps more of its utility.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
PostRI enables the release of a differentially private median followed by a post-hoc computation of a randomization interval that bounds the error introduced by the DP noise mechanism. Because the interval is derived after the median release, the median itself can be computed with substantially less noise than in earlier approaches that had to entangle the two steps. The result is a median whose utility is 14 to 850 percent higher than related work while the accompanying interval remains narrow.
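To make "noise whose distribution depends on the input" concrete, here is a minimal sketch of one common way to release a private median: the exponential mechanism over a fixed candidate grid. This is an assumption for illustration only; the summary above does not say which mechanism PostRI builds on, and the function name, the candidate grid, and the two toy datasets are hypothetical.

```python
import bisect
import math
import random

def dp_median_exponential(data, candidates, epsilon, rng):
    """Sample a median estimate from `candidates` with the exponential mechanism.

    Utility of a candidate x is minus the distance from n/2 to the rank
    interval [#records < x, #records <= x]. Replacing one record moves each
    end of that interval by at most 1, so the sensitivity is 1 (assumed
    neighbouring relation: one record replaced).
    """
    ordered = sorted(data)
    half = len(ordered) / 2
    utilities = []
    for x in candidates:
        below = bisect.bisect_left(ordered, x)     # records strictly below x
        at_most = bisect.bisect_right(ordered, x)  # records at or below x
        utilities.append(-max(0.0, below - half, half - at_most))
    best = max(utilities)  # shift for numerical stability; distribution unchanged
    weights = [math.exp(epsilon * (u - best) / 2) for u in utilities]
    return rng.choices(candidates, weights=weights, k=1)[0]

rng = random.Random(0)
candidates = list(range(0, 101))
datasets = {
    "tight": [50] * 99 + [0, 100],   # almost all mass at the median
    "spread": list(range(0, 101)),   # mass spread evenly around the median
}

errors = {}
for name, data in datasets.items():
    draws = [dp_median_exponential(data, candidates, 1.0, rng) for _ in range(2000)]
    errors[name] = sum(abs(d - 50) for d in draws) / len(draws)
    print(name, round(errors[name], 2))
```

Both datasets have true median 50 and use the same epsilon, yet the released value scatters far more on the spread dataset. That data dependence is why a fixed-width error bar cannot simply be read off the privacy parameters, which is the gap a randomization interval for the median is meant to fill.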
What carries the argument
PostRI, a post-release procedure that constructs a randomization interval for the already-released differentially private median without additional privacy cost.
If this is right
- Median estimates under differential privacy can be made closer to the non-private value while still supplying an interpretable error bound.
- Analysts no longer face an explicit tradeoff between median accuracy and the ability to understand the scale of the added noise.
- The same separation of release and interval computation may extend to other statistics whose noise scale depends on the input data.
- Libraries implementing differential privacy can add automatic post-release interval support for median queries without changing the underlying privacy mechanism.
Where Pith is reading between the lines
- If PostRI works for medians, similar post-processing might be developed for other order statistics or for quantiles that also have data-dependent noise.
- Widespread adoption would let data curators publish more accurate private medians for applications such as income or health statistics while still giving users concrete error ranges.
- The method could be tested on real-world datasets to measure how often the post-computed intervals are narrow enough to be useful in practice.
Load-bearing premise
Computing and releasing the randomization interval after the median has already been released does not create new privacy leakage or invalidate the interval guarantees.
What would settle it
An attack or simulation in which an adversary, given only the released median and the later randomization interval, recovers more information about the private dataset than the original differential privacy budget permits.
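One building block for such a simulation is a direct privacy-loss check on neighbouring datasets. The sketch below covers only the median release, under the assumption that it is an exponential mechanism over a small candidate grid (the summary does not confirm this); the helper names and toy data are hypothetical. Auditing the joint (median, interval) output would additionally need the interval's distribution, which is not specified here.

```python
import bisect
import math

def median_output_distribution(data, candidates, epsilon):
    """Exact output probabilities of an exponential-mechanism median (assumed mechanism)."""
    ordered = sorted(data)
    half = len(ordered) / 2
    utilities = []
    for x in candidates:
        below = bisect.bisect_left(ordered, x)     # records strictly below x
        at_most = bisect.bisect_right(ordered, x)  # records at or below x
        # Minus the distance from n/2 to the rank interval of x; sensitivity 1.
        utilities.append(-max(0.0, below - half, half - at_most))
    best = max(utilities)
    weights = [math.exp(epsilon * (u - best) / 2) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

def max_log_ratio(p, q):
    """Largest |log(p_i / q_i)| over outputs; at most epsilon under pure epsilon-DP."""
    return max(abs(math.log(a / b)) for a, b in zip(p, q))

epsilon = 1.0
candidates = list(range(0, 21))
data = [3, 5, 8, 10, 10, 12, 15, 18, 20]
neighbour = [3, 5, 8, 10, 10, 12, 15, 18, 0]  # one record replaced

p = median_output_distribution(data, candidates, epsilon)
q = median_output_distribution(neighbour, candidates, epsilon)
observed_loss = max_log_ratio(p, q)
print(observed_loss, observed_loss <= epsilon + 1e-9)
```

A single pair of datasets can only refute a privacy claim, never prove it. If the same check on the joint output distribution exceeded the stated budget for some neighbouring pair, that would be the kind of evidence described above.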
Original abstract
It can be difficult for practitioners to interpret the quality of differentially private (DP) statistics due to the added noise. One method to help analysts understand the amount of error introduced by DP is to return a Randomization Interval (RI) along with the statistic. An RI is a type of confidence interval that bounds the error introduced by DP. For queries where the noise distribution depends on the input, such as the median, prior work degrades the quality of the median itself to obtain a high-quality RI. In this work, we propose PostRI, a solution to compute an RI after the median has been estimated. PostRI enables a median estimation with 14%-850% higher utility than related work, while maintaining a narrow RI.
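For contrast with the median, the easy case is a query whose noise does not depend on the data. The sketch below is not PostRI; it shows a randomization interval for a counting query released with Laplace noise, assuming sensitivity 1 and a target coverage of 1 - alpha. The function names and parameter values are illustrative.

```python
import math
import random

def laplace_ri(release, scale, alpha):
    """Interval that contains the noise-free value with probability 1 - alpha.

    For Laplace noise Z with scale b, P(|Z| <= t) = 1 - exp(-t / b),
    so t = b * ln(1 / alpha) gives coverage exactly 1 - alpha.
    """
    half_width = scale * math.log(1.0 / alpha)
    return release - half_width, release + half_width

def laplace_noise(scale, rng):
    # The difference of two independent exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

rng = random.Random(1)
true_count, epsilon, alpha = 1000, 0.5, 0.05
scale = 1.0 / epsilon  # sensitivity 1 assumed for a counting query

trials, covered = 20000, 0
for _ in range(trials):
    release = true_count + laplace_noise(scale, rng)
    low, high = laplace_ri(release, scale, alpha)
    covered += low <= true_count <= high
coverage = covered / trials
print(round(coverage, 3))
```

Here the interval needs only the released value and public parameters, so it is pure post-processing. The median is harder because the spread of the released value depends on the private data, so the half-width cannot be derived from epsilon alone.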
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes PostRI, a post-processing method to compute a randomization interval (RI) after first releasing a differentially private median estimate. Unlike prior work that degrades median utility to obtain a high-quality RI for input-dependent noise, PostRI claims to deliver 14%-850% higher utility for the median while preserving a narrow RI that bounds the DP-induced error.
Significance. If the joint privacy guarantee and coverage properties hold, the result would be a useful practical advance for making DP medians more interpretable without utility sacrifice. The approach correctly exploits the fact that the initial median release consumes the privacy budget, but its value hinges on whether the subsequent RI step can be shown to add no extra leakage when the noise distribution depends on the data.
major comments (2)
- §4 (Privacy Analysis of PostRI): The claim that the joint (median, RI) output satisfies the original (ε,δ)-DP guarantee without additional budget is load-bearing. The RI computation is data-dependent and occurs after the median release; please supply the explicit composition argument or reduction showing that re-accessing the input for the RI does not violate post-processing or inflate the effective privacy loss. Standard post-processing applies only to functions of the already-released output.
- §5.3 (Utility Experiments): The reported 14%-850% utility gains are central to the contribution. Clarify the exact baselines, privacy budgets, datasets, and RI-width controls used for each end of the range; without these details it is unclear whether the gains are robust or arise only under specific parameter regimes.
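The distinction behind the §4 comment can be stated with two standard facts about differential privacy. The notation is introduced here for illustration and is not taken from the paper: M is the median release on dataset D, g is any (possibly randomized) map that does not read D, and M_2 is a second step that does read D.

```latex
% Post-processing: g may be randomized but must not access D.
M \text{ is } (\varepsilon,\delta)\text{-DP}
  \;\Longrightarrow\;
  g \circ M \text{ is } (\varepsilon,\delta)\text{-DP}

% Basic adaptive composition: if, for every fixed y, the map D \mapsto M_2(D, y)
% is (\varepsilon_2,\delta_2)-DP, then the joint release satisfies
D \mapsto \bigl( M(D),\; M_2(D, M(D)) \bigr)
  \text{ is } (\varepsilon + \varepsilon_2,\; \delta + \delta_2)\text{-DP}
```

If the interval is computed as g applied to the released median, the first statement applies and no budget is added. If it re-reads the data, the second statement is the default bound, and claiming the original (ε,δ) for the joint output requires a separate argument.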
minor comments (2)
- Abstract: The utility improvement range is stated without reference to the privacy parameter or mechanism; a single sentence on the DP setting would improve context.
- Notation: The symbols for the lower and upper bounds of the RI and for the data-dependent noise scale should be defined once and used consistently across sections and figures.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments. We address each major point below and will revise the manuscript to incorporate the requested clarifications on privacy analysis and experimental details.
- Referee: §4 (Privacy Analysis of PostRI): The claim that the joint (median, RI) output satisfies the original (ε,δ)-DP guarantee without additional budget is load-bearing. The RI computation is data-dependent and occurs after the median release; please supply the explicit composition argument or reduction showing that re-accessing the input for the RI does not violate post-processing or inflate the effective privacy loss. Standard post-processing applies only to functions of the already-released output.
  Authors: We agree that an explicit argument is necessary to substantiate the joint privacy claim. The manuscript currently invokes the post-processing theorem after the median release consumes the full budget, but does not detail how the data-dependent RI step avoids additional leakage. We will revise Section 4 to include a formal reduction: the RI is computed from the released median value together with publicly known parameters (noise distribution family, sensitivity bounds, and the fixed privacy parameters), without further queries to the private dataset. This reduction shows that the joint output is a (possibly randomized) function of the already-released DP median alone, preserving the original (ε,δ) guarantee. The revised section will contain the full argument.
  Revision: yes
- Referee: §5.3 (Utility Experiments): The reported 14%-850% utility gains are central to the contribution. Clarify the exact baselines, privacy budgets, datasets, and RI-width controls used for each end of the range; without these details it is unclear whether the gains are robust or arise only under specific parameter regimes.
  Authors: We acknowledge that the range of reported gains requires precise contextualization. We will expand Section 5.3 with a table that enumerates, for each reported percentage, the exact baseline method, the privacy budget ε (and δ if applicable), the dataset (synthetic and real-world instances), the number of repetitions, and the RI-width control (e.g., fixed absolute width or quantile-based). This will demonstrate that the improvements hold across the tested regimes rather than in isolated settings. The revision will be made.
  Revision: yes
Circularity Check
No significant circularity; PostRI derivation is self-contained
The paper introduces PostRI as a method to compute a randomization interval after releasing a DP median estimate, claiming improved utility over prior approaches that degrade the median quality. The abstract and described approach rely on standard differential privacy post-processing and composition properties applied to an existing median mechanism, with utility gains presented via direct empirical comparison rather than any fitted parameter renamed as a prediction or self-referential definition. No load-bearing step in the provided description reduces by construction to its own inputs, self-citation chains, or ansatz smuggling; the central claims remain independent of the target result.
A Proofs
A.1 Proof of Theorem 3.2
We first prove the following lemma.
Lemma A.1. 𝑓 has sensitivity 1. Formally: max 𝐷,𝐷′ ∈ ...