pith. sign in

arxiv: 2605.21601 · v1 · pith:EQHTDIKLnew · submitted 2026-05-20 · 💻 cs.DS · cs.CR

Near-Optimal Generalized Private Testing

Pith reviewed 2026-05-22 08:34 UTC · model grok-4.3

classification 💻 cs.DS cs.CR
keywords differential privacygeneralized private testingthresholding mechanismcontinual observationblack-box reductionsample complexitynear-optimality
4
0 comments X

The pith

The Generalized Thresholding Mechanism achieves pure differential privacy for testing a sequence of mechanisms while bounding evaluations near-optimally.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces the Generalized Thresholding Mechanism (GTM) to solve the generalized private testing problem introduced by Liu and Talwar. For any sequence of black-box (ε_t, δ_t)-DP mechanisms, GTM ensures the overall procedure remains pure ε-DP. With probability 1-β it accepts the first mechanism whose success probability meets or exceeds a given threshold p* and rejects those below a lower threshold p-bar, while using a number of evaluations bounded by O((ln(t/β)/(γ-1)^2) max(Λ_t/p*, (1-p*)^{-1})) for parameters θ, γ, β and the defined Λ_t. The authors prove matching lower bounds establishing near-optimality of both the accuracy gap and the sample complexity. This generalizes the Sparse Vector Technique and supplies a black-box reduction from continual-observation DP optimization to the batch setting plus support for adaptive thresholds.

Core claim

For any sequence of (ε_t, δ_t)-DP mechanisms M_t, the GTM is pure ε-DP. For θ>0, γ∈(1,2], β∈(0,1) it sets p-bar_t = max(p*/γΛ_t, 1-γΛ_t(1-p*)) - δ_t/ε_t where Λ_t=(5t ln^3(t+2))^{(2+θ)ε_t/ε} (4/β)^{(3+θ+2/θ)ε_t/ε}. With probability 1-β the number of evaluations of M_t is at most O((ln(t/β)/(γ-1)^2) max(Λ_t/p*, (1-p*)^{-1})) for all t≥1, accepting the first mechanism with p_t≥p* while rejecting those with p_t≤p-bar_t. Lower bounds prove the accuracy and sample-complexity guarantees are near-optimal.

What carries the argument

Generalized Thresholding Mechanism (GTM), which composes privacy losses across mechanisms and applies concentration inequalities to decide acceptance or rejection of each black-box mechanism while preserving overall pure ε-DP.

If this is right

  • Yields the first DP algorithms for many maximization problems in the continual observation setting via a black-box reduction to the batch setting.
  • Permits an adaptive choice of acceptance thresholds (p*_t) for applications such as hyperparameter optimization.
  • Establishes near-optimality of the accuracy gap and evaluation bounds through matching lower bounds.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The black-box reduction may allow existing batch DP optimization routines to be applied directly to online continual-observation problems.
  • Adaptive thresholds could simplify private hyperparameter search by letting the acceptance criterion change with observed performance.
  • The logarithmic dependence on t suggests the method remains efficient even for very long sequences of candidate mechanisms.

Load-bearing premise

The success probabilities p_t are fixed non-adaptive properties of each mechanism and the failure probability β can be chosen independently of the data.

What would settle it

A sequence of mechanisms in which later success probabilities p_t change depending on whether earlier mechanisms were accepted or rejected would cause the accuracy guarantee to fail.

read the original abstract

In differential privacy (DP), the generalized private testing problem was introduced by Liu and Talwar (STOC 2019). Given a dataset $X \in \mathcal{X}$ and a sequence of black-box $\varepsilon_t$-DP mechanisms $M_t:\mathcal{X}\to\{+1,-1\}$, the analyst must accept the first mechanism whose success probability $p_t=\Pr[M_t(X)=+1]$ exceeds a given threshold $p^*\in(0,1)$, while achieving DP. Accuracy is measured by the gap between $p^*$ and a rejection threshold $\bar{p}$, such that with probability $1-\beta$ for all $t\geq1$, if $p_t\leq\bar{p}$, then $M_t$ is rejected, and if $p_t\geq p^*$, then it is accepted. This generalizes the standard private testing problem, whose solution, the Sparse Vector Technique, is ubiquitous in DP. We introduce the Generalized Thresholding Mechanism (GTM) for generalized private testing. For $\varepsilon>0$ and any sequence of $(\varepsilon_t,\delta_t)$-DP mechanisms $M_t$, the GTM is pure $\varepsilon$-DP. For $\theta>0$, $\gamma\in(1,2]$, and $\beta\in(0,1)$, $\bar{p}_t=\max(p^*/\gamma\Lambda_t, 1 - \gamma\Lambda_t(1-p^*))-\delta_t/\varepsilon_t$ for $\Lambda_t=(5t\ln^3(t+2))^{(2+\theta)\varepsilon_t/\varepsilon}(4/\beta)^{(3+\theta+2/\theta)\varepsilon_t/\varepsilon}$. With probability $1-\beta$, the number of evaluations of $M_t$ is at most $O((\ln(t/\beta)/(\gamma-1)^2)\max(\Lambda_t/p^*,(1-p^*)^{-1}))$ for all $t\geq 1$. Our lower bounds prove near-optimality of our accuracy and sample complexity guarantees. Via the GTM, we give a black-box reduction for DP optimization from the continual observation (CO) setting to the batch setting. This gives us the first DP-CO algorithms for many maximization problems. Further, the GTM permits an adaptive choice of acceptance thresholds $(p^*_t)_{t\geq1}$, addressing a challenge mentioned in prior work on using generalized private testing for hyperparameter optimization (Papernot and Steinke (ICLR 2022)).

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces the Generalized Thresholding Mechanism (GTM) for the generalized private testing problem in differential privacy. Given a dataset X and any sequence of black-box (ε_t, δ_t)-DP mechanisms M_t : X → {+1, -1}, the GTM is pure ε-DP. For parameters θ > 0, γ ∈ (1,2], β ∈ (0,1), it defines Λ_t = (5t ln³(t+2))^{(2+θ)ε_t/ε} (4/β)^{(3+θ+2/θ)ε_t/ε} and p-bar_t = max(p*/(γ Λ_t), 1 - γ Λ_t (1-p*)) - δ_t/ε_t. With probability 1-β, GTM accepts the first t with p_t ≥ p* and rejects those with p_t ≤ p-bar_t, using at most O((ln(t/β)/(γ-1)²) max(Λ_t/p*, (1-p*)^{-1})) evaluations of each M_t. Lower bounds establish near-optimality of the accuracy gap and sample complexity. Applications include a black-box reduction from continual-observation to batch DP optimization for maximization problems and support for adaptive acceptance thresholds p*_t.

Significance. If the central claims hold, the work provides a near-optimal generalization of the Sparse Vector Technique to the generalized private testing setting, with matching upper and lower bounds up to constants. The black-box reduction enabling the first DP-CO algorithms for many maximization problems is a concrete algorithmic contribution. The explicit support for adaptive (p*_t) sequences directly addresses a challenge noted in prior hyperparameter-optimization work. Machine-checked proofs or reproducible code are not mentioned, but the parameter-free character of the final bounds (once θ, γ, β are fixed) and the information-theoretic lower bounds are strengths.

major comments (2)
  1. [Theorem 1] Theorem 1 (and the problem definition in the abstract): the 1-β accuracy guarantee is derived under the assumption that each p_t is a fixed, deterministic constant for the given dataset X, independent of the GTM’s internal randomness and accept/reject decisions. If the sequence (M_t) may be chosen adaptively—so that the numerical value of p_t can depend on prior GTM outcomes—the events {p_t ≥ p*} and {GTM accepts t} become dependent; the concentration and union-bound arguments used to absorb the growing Λ_t factor no longer necessarily deliver the claimed probability bound.
  2. [Definition of p-bar_t and Λ_t] Definition of p-bar_t and Λ_t (abstract and §3): the subtraction of δ_t/ε_t and the specific exponent (2+θ)ε_t/ε on the t ln³(t+2) term are presented as absorbing the (ε_t, δ_t)-DP of each M_t into the pure ε-DP of GTM. The error analysis showing that this adjustment preserves the near-optimality gap (i.e., that the effective separation between p* and p-bar_t remains Θ(1) when Λ_t is not too large) is not visible in the provided derivation sketch; without it the claim that the bounds are “near-optimal” rests on an unverified constant-factor degradation.
minor comments (2)
  1. [Abstract] The exponent (3+θ+2/θ) in Λ_t appears in the sample-complexity bound but is not motivated in the abstract; a brief remark on its origin (e.g., from a particular concentration inequality) would improve readability.
  2. [Applications] The application to continual-observation optimization is stated at a high level; naming one concrete maximization problem (e.g., private top-k selection) for which the reduction yields the first DP-CO algorithm would make the claim more tangible.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful and constructive review of our manuscript. We address each major comment below with clarifications and indicate the revisions we will make. We believe these responses resolve the concerns while preserving the core contributions of the work.

read point-by-point responses
  1. Referee: [Theorem 1] Theorem 1 (and the problem definition in the abstract): the 1-β accuracy guarantee is derived under the assumption that each p_t is a fixed, deterministic constant for the given dataset X, independent of the GTM’s internal randomness and accept/reject decisions. If the sequence (M_t) may be chosen adaptively—so that the numerical value of p_t can depend on prior GTM outcomes—the events {p_t ≥ p*} and {GTM accepts t} become dependent; the concentration and union-bound arguments used to absorb the growing Λ_t factor no longer necessarily deliver the claimed probability bound.

    Authors: We thank the referee for this observation. The problem formulation, consistent with Liu and Talwar (STOC 2019), takes as input a fixed sequence of black-box mechanisms (M_t). Each p_t is therefore a deterministic quantity determined by the fixed dataset X and the mechanism M_t, independent of the GTM's internal randomness or its accept/reject decisions. The 1-β guarantee follows from applying concentration bounds and a union bound over this fixed sequence. We agree that an adaptive choice of the mechanisms themselves (where p_t could depend on prior GTM outputs) would introduce dependencies requiring different techniques such as martingale bounds. This adaptive-mechanism setting lies outside the stated problem. We will add an explicit sentence to the statement of Theorem 1 and to the problem definition in Section 2 clarifying that the sequence (M_t) is fixed in advance. revision: partial

  2. Referee: [Definition of p-bar_t and Λ_t] Definition of p-bar_t and Λ_t (abstract and §3): the subtraction of δ_t/ε_t and the specific exponent (2+θ)ε_t/ε on the t ln³(t+2) term are presented as absorbing the (ε_t, δ_t)-DP of each M_t into the pure ε-DP of GTM. The error analysis showing that this adjustment preserves the near-optimality gap (i.e., that the effective separation between p* and p-bar_t remains Θ(1) when Λ_t is not too large) is not visible in the provided derivation sketch; without it the claim that the bounds are “near-optimal” rests on an unverified constant-factor degradation.

    Authors: We appreciate the referee drawing attention to the need for explicit verification of the accuracy gap. The exponent (2+θ) on the t ln³(t+2) term arises from the privacy-composition analysis that allocates the overall ε budget across the growing number of mechanisms while leaving a θ-margin for concentration; the subtraction of δ_t/ε_t compensates for the approximate DP of each M_t. In the complete proof (Appendix, Section 4), we show that the resulting gap between p* and p-bar_t is larger than the information-theoretic optimum by only a multiplicative factor depending on γ and θ. Because γ can be chosen arbitrarily close to 1 and θ arbitrarily small, this factor can be made arbitrarily close to 1, preserving near-optimality up to constants that match the lower bounds. To make this reasoning visible without relying on the appendix, we will expand the paragraph immediately following the definition of p-bar_t and Λ_t with a short derivation of the constant-factor gap. revision: yes

Circularity Check

0 steps flagged

No significant circularity; GTM construction and bounds are self-contained

full rationale

The paper presents the Generalized Thresholding Mechanism (GTM) as an explicit new construction whose pure ε-DP guarantee follows from standard DP composition and the thresholding rule applied to any sequence of (ε_t, δ_t)-DP mechanisms. Accuracy and sample-complexity claims are obtained by applying concentration inequalities and a union bound whose failure probability is absorbed into the explicitly defined Λ_t factor; these are external probabilistic tools. The lower bounds are information-theoretic and proved independently of the upper-bound construction. No equation reduces a claimed prediction to a fitted parameter by construction, no load-bearing premise rests solely on a self-citation, and the modeling assumption that p_t are fixed properties of each M_t is stated up-front in the problem definition rather than smuggled in via the derivation itself. The derivation is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

3 free parameters · 2 axioms · 0 invented entities

The central claim rests on standard probabilistic concentration inequalities and the definition of (ε,δ)-DP; no new entities are postulated and the only free parameters are the user-chosen accuracy parameters θ, γ, β.

free parameters (3)
  • θ > 0
    User-chosen parameter that appears in the exponent of Λ_t; controls the polynomial dependence on t.
  • γ ∈ (1,2]
    User-chosen accuracy parameter that trades off the gap between p* and p-bar.
  • β ∈ (0,1)
    Failure probability that appears in the sample-complexity bound.
axioms (2)
  • standard math Standard Chernoff and union bounds over t mechanisms
    Invoked to bound the probability that any mechanism is misclassified.
  • domain assumption Each M_t is an (ε_t, δ_t)-DP black-box mechanism with fixed success probability p_t
    Core modeling assumption stated in the problem definition.

pith-pipeline@v0.9.0 · 6004 in / 1633 out tokens · 27065 ms · 2026-05-22T08:34:53.694453+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

17 extracted references · 17 canonical work pages · 1 internal anchor

  1. [1]

    Differentially pri- vate algorithms for graph cuts: A shifting mechanism approach and more.arXiv preprint arXiv:2407.06911,

    [CDFZ24] Rishi Chandra, Michael Dinitz, Chenglin Fan, and Zongrui Zou. Differentially pri- vate algorithms for graph cuts: A shifting mechanism approach and more.arXiv preprint arXiv:2407.06911,

  2. [2]

    Generalized private selection and testing with high confidence

    [CLN+23] Edith Cohen, Xin Lyu, Jelani Nelson, Tamás Sarlós, and Uri Stemmer. Generalized private selection and testing with high confidence. In14th Innovations in Theoreti- cal Computer Science Conference (ITCS 2023), pages 39–1. Schloss Dagstuhl–Leibniz- Zentrum für Informatik,

  3. [3]

    Nguyen, and Thy Dinh Nguyen

    [CNN23] Anamay Chaturvedi, Huy L. Nguyen, and Thy Dinh Nguyen. Streaming submod- ular maximization with differential privacy. In Andreas Krause, Emma Brunskill, Kyunghyun Cho, Barbara Engelhardt, Sivan Sabato, and Jonathan Scarlett, editors, International Conference on Machine Learning, ICML 2023, 23-29 July 2023, Honolulu, Hawaii, USA, volume 202 ofProce...

  4. [4]

    Differentially private decomposable submodular maximization

    62 [CNZ21] Anamay Chaturvedi, Huy Le Nguyen, and Lydia Zakynthinou. Differentially private decomposable submodular maximization. InThirty-Fifth AAAI Conference on Artifi- cial Intelligence, AAAI 2021, Virtual Event, February 2-9, 2021, pages 6984–6992. AAAI Press,

  5. [5]

    Almost tight bounds for differentially private densest subgraph

    [DKLV25] Michael Dinitz, Satyen Kale, Silvio Lattanzi, and Sergei Vassilvitskii. Almost tight bounds for differentially private densest subgraph. InProceedings of the 2025 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 2908–2950. SIAM,

  6. [6]

    Rothblum, and Salil P

    [DNR+09] Cynthia Dwork, Moni Naor, Omer Reingold, Guy N. Rothblum, and Salil P . Vad- han. On the complexity of differentially private data release: efficient algorithms and hardness results. In Michael Mitzenmacher, editor,Proceedings of the 41st Annual ACM Symposium on Theory of Computing, STOC 2009, Bethesda, MD, USA, May 31 - June 2, 2009, pages 381–390. ACM,

  7. [7]

    Sub- linear space graph algorithms in the continual release model.arXiv preprint arXiv:2407.17619,

    [ELMZ24] Alessandro Epasto, Quanquan C Liu, Tamalika Mukherjee, and Felix Zhou. Sub- linear space graph algorithms in the continual release model.arXiv preprint arXiv:2407.17619,

  8. [8]

    In- dividualized privacy accounting via subsampling with applications in combinatorial optimization.arXiv preprint arXiv:2405.18534,

    [GKK+24] Badih Ghazi, Pritish Kamath, Ravi Kumar, Pasin Manurangsi, and Adam Sealfon. In- dividualized privacy accounting via subsampling with applications in combinatorial optimization.arXiv preprint arXiv:2405.18534,

  9. [9]

    [HC22] Yiyang Huang and Clément L. Canonne. Lemmas of differential privacy.CoRR, abs/2211.11189,

  10. [10]

    A multiplicative weights mechanism for privacy-preserving data analysis

    [HR10] Moritz Hardt and Guy N Rothblum. A multiplicative weights mechanism for privacy-preserving data analysis. InFoundations of Computer Science (FOCS), 2010 51st Annual IEEE Symposium on, pages 61–70. IEEE,

  11. [11]

    Differentially private and scal- able estimation of the network principal component.arXiv preprint arXiv:2505.03858,

    [KVK25] Alireza Khayatian, Anil Vullikanti, and Aritra Konar. Differentially private and scal- able estimation of the network principal component.arXiv preprint arXiv:2505.03858,

  12. [12]

    Private Selection from Private Candidates

    63 [LT18] Jingcheng Liu and Kunal Talwar. Private selection from private candidates.arXiv preprint arXiv:1811.07971,

  13. [13]

    Private selection from private candidates

    [LT19] Jingcheng Liu and Kunal Talwar. Private selection from private candidates. In Moses Charikar and Edith Cohen, editors,Proceedings of the 51st Annual ACM SIGACT Sym- posium on Theory of Computing, STOC 2019, Phoenix, AZ, USA, June 23-26, 2019, pages 298–309. ACM,

  14. [14]

    Purifying approximate differential privacy with randomized post-processing.CoRR, abs/2503.21071,

    [LWMW25] Yingyu Lin, Erchi Wang, Yi-An Ma, and Yu-Xiang Wang. Purifying approximate differential privacy with randomized post-processing.CoRR, abs/2503.21071,

  15. [15]

    Almost-polynomial ratio eth-hardness of approximating densest k-subgraph

    [Man17] Pasin Manurangsi. Almost-polynomial ratio eth-hardness of approximating densest k-subgraph. In Hamed Hatami, Pierre McKenzie, and Valerie King, editors,Proceed- ings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, Montreal, QC, Canada, June 19-23, 2017, pages 954–961. ACM,

  16. [16]

    Hyperparameter tuning with renyi differen- tial privacy.arXiv preprint arXiv:2110.03620,

    [PS21] Nicolas Papernot and Thomas Steinke. Hyperparameter tuning with renyi differen- tial privacy.arXiv preprint arXiv:2110.03620,

  17. [17]

    Fast and private submodular and k-submodular functions maximization with matroid constraints

    [RY20] Akbar Rafiey and Yuichi Yoshida. Fast and private submodular and k-submodular functions maximization with matroid constraints. InProceedings of the 37th Interna- tional Conference on Machine Learning, ICML 2020, 13-18 July 2020, Virtual Event, vol- ume 119 ofProceedings of Machine Learning Research, pages 7887–7897. PMLR,