pith. sign in

arxiv: 2606.23096 · v1 · pith:5ZUWWHJHnew · submitted 2026-06-22 · 💻 cs.LG · cs.IT· math.IT

Minimax Quantile Lower Bounds for Interactive Statistical Decision Making with Privacy

Pith reviewed 2026-06-26 09:06 UTC · model grok-4.3

classification 💻 cs.LG cs.ITmath.IT
keywords minimax quantileinteractive Fano methodGaussian banditsprivacy budgetlower boundsinteractive statistical decision makinghigh-probability boundsmutual information privacy
0
0 comments X

The pith

Minimax quantile lower bounds for interactive decision making are explicit in confidence level and privacy budget.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a δ-explicit minimax-quantile theory for interactive statistical decision making to address the limitation that standard minimax risk and regret focus only on expectations. It supplies structural relations between quantiles and risk, plus two high-probability converse tools: an interactive Fano method and an interactive Le Cam method. Mutual-information privacy is incorporated by restricting the decision class, and a two-point template isolates the resulting variance inflation; these tools produce explicit lower bounds for Gaussian mean estimation and for K-armed Gaussian bandits.

Core claim

By deriving a high-probability interactive Fano method, the paper obtains a minimax quantile lower bound for the K-armed Gaussian bandit problem that accounts for exploration cost across multiple candidate best arms; the resulting bounds are stated directly in terms of the failure probability δ and the privacy budget, with privacy appearing as a Gaussian variance-inflation factor.

What carries the argument

High-probability interactive Fano's method, which converts mutual-information quantities into explicit lower bounds on the probability that the decision rule fails to meet a given performance threshold.

If this is right

  • Squared-error Gaussian mean estimation has a minimax quantile lower bound that scales as log(1/δ)/n.
  • Two-armed bounded-mean Gaussian bandits have a minimax quantile lower bound that scales as sqrt(T log(1/δ)).
  • K-armed Gaussian bandits have a minimax quantile lower bound of sqrt(KT log(1/δ)) type.
  • For private problems the same scalings hold after multiplication by a Gaussian variance-inflation factor obtained from the two-point template.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same high-probability Fano template could be applied to other interactive problems whose performance is measured by quantiles rather than expectations.
  • The variance-inflation factor derived for coordinatewise Gaussian privatization may serve as a plug-in adjustment when other local privacy mechanisms are used.
  • The equivalence between strict and lower minimax quantiles outside a countable set of δ values suggests that quantile-based analysis can replace risk-based analysis in most practical regimes.

Load-bearing premise

Mutual information privacy can be handled in the same framework by restricting the admissible decision class.

What would settle it

A concrete algorithm for the K-armed Gaussian bandit problem whose achieved δ-quantile risk falls strictly below the derived sqrt(KT log(1/δ)) expression (with the privacy variance factor) would falsify the lower bound.

read the original abstract

Minimax risk and regret are expectation-based criteria and do not capture rare but consequential failures. To address this concern, we develop a $\delta$-explicit minimax-quantile theory for interactive statistical decision making (ISDM). We first provide structural relations between minimax quantiles, lower minimax quantiles, and minimax risk. This includes a quantile-to-expectation conversion and an equivalence between strict and lower minimax quantiles outside a countable set of confidence levels. We then derive two converse tools for ISDM: a high-probability interactive Fano's method and a high-probability interactive Le Cam's method. Then, we show that mutual-information (MI) privacy can be handled in the same framework by restricting the admissible decision class. For coordinatewise Gaussian privatization, we derive a two-point template that isolates the privacy-induced variance inflation. We instantiate this template for Gaussian mean estimation, and use the same two-point strategy directly for two-armed Gaussian bandits. We then derive a minimax quantile lower bound for the $K$-armed Gaussian bandit problem, showing that the interactive Fano method captures the exploration cost over multiple possible best arms. The resulting lower bounds are explicit in the confidence level $\delta$ and in the privacy budget for the private problems. They yield $\log(1/\delta)/n$ scaling for squared-error Gaussian mean estimation, $\sqrt{T\log(1/\delta)}$ scaling for two-armed bounded-mean Gaussian bandits, and $\sqrt{KT\log(1/\delta)}$-type scaling for the $K$-armed bandits, with privacy appearing through a Gaussian variance-inflation factor for the private problems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper develops a δ-explicit minimax-quantile theory for interactive statistical decision making (ISDM). It establishes structural relations between minimax quantiles, lower minimax quantiles, and minimax risk (including a quantile-to-expectation conversion and equivalence outside countable δ values). It introduces two high-probability converse tools—an interactive Fano method and an interactive Le Cam method—and extends the framework to mutual-information privacy by restricting the admissible decision class. A two-point template isolates privacy-induced variance inflation under coordinatewise Gaussian privatization. The template is applied to Gaussian mean estimation and two-armed bandits, then extended via the interactive Fano method to derive explicit minimax quantile lower bounds for the K-armed Gaussian bandit problem. The resulting bounds are explicit in δ and the privacy budget, recovering scalings such as log(1/δ)/n for squared-error mean estimation, √(T log(1/δ)) for two-armed bandits, and √(KT log(1/δ))-type for K-armed bandits, with privacy entering through a Gaussian variance-inflation factor.

Significance. If the derivations hold, the work is significant because it supplies the first δ-explicit quantile lower bounds for interactive problems that directly incorporate privacy budgets and capture exploration costs over multiple arms. The structural relations and the two new high-probability converse tools provide reusable machinery beyond the specific instantiations. The explicit dependence on δ and privacy, together with the two-point template, yields falsifiable, parameter-explicit predictions that can be checked against existing expectation-based bounds.

major comments (2)
  1. [MI privacy extension (abstract paragraph and corresponding section)] Abstract and the section extending the framework to MI privacy: the claim that mutual-information privacy is handled simply by restricting the admissible decision class must be shown to bound the mutual information between the entire adaptive interaction transcript and the final output (not merely the terminal decision rule). If the restriction leaves the querying mechanism unconstrained, the two-point template and variance-inflation factor may understate leakage, which is load-bearing for all private lower bounds.
  2. [K-armed bandit lower bound derivation] Section deriving the K-armed Gaussian bandit lower bound: the interactive Fano method is asserted to capture the exploration cost over K possible best arms, yet the manuscript provides only a sketched combination of the two-point template with the high-probability converse. The precise manner in which the variance-inflation factor interacts with the multi-arm information term needs an explicit equation-level derivation, as this step is central to the claimed √(KT log(1/δ)) scaling.
minor comments (2)
  1. [Structural relations] Notation for the lower minimax quantile versus the strict minimax quantile should be introduced with a single consistent symbol set in the structural-relations section to avoid later ambiguity.
  2. [Two-point template] The two-point template for coordinatewise Gaussian privatization would benefit from an explicit statement of the admissible decision class before its use in the bandit instantiations.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive report. The two major comments identify places where additional clarification and explicit derivations are needed. We will revise the manuscript to address both points.

read point-by-point responses
  1. Referee: [MI privacy extension (abstract paragraph and corresponding section)] Abstract and the section extending the framework to MI privacy: the claim that mutual-information privacy is handled simply by restricting the admissible decision class must be shown to bound the mutual information between the entire adaptive interaction transcript and the final output (not merely the terminal decision rule). If the restriction leaves the querying mechanism unconstrained, the two-point template and variance-inflation factor may understate leakage, which is load-bearing for all private lower bounds.

    Authors: We agree that MI privacy must be defined with respect to the full adaptive transcript (queries plus outputs). In the ISDM framework the admissible decision class consists of interactive policies, i.e., mappings from the entire history of observations and queries to the next query and ultimately to the terminal output. Restricting this class to policies whose joint distribution over the transcript and output satisfies the MI bound therefore constrains both the querying mechanism and the final decision rule. The two-point template is applied only inside this restricted class. We will add an explicit paragraph in the MI-privacy section making this definition and the resulting bound on transcript leakage precise. revision: yes

  2. Referee: [K-armed bandit lower bound derivation] Section deriving the K-armed Gaussian bandit lower bound: the interactive Fano method is asserted to capture the exploration cost over K possible best arms, yet the manuscript provides only a sketched combination of the two-point template with the high-probability converse. The precise manner in which the variance-inflation factor interacts with the multi-arm information term needs an explicit equation-level derivation, as this step is central to the claimed √(KT log(1/δ)) scaling.

    Authors: The referee is correct that the current derivation of the K-armed bound is only sketched. We will replace the sketch with an explicit sequence of equations that (i) applies the two-point template to obtain the per-arm variance-inflation factor σ_priv², (ii) substitutes this factor into the KL divergence term inside the interactive Fano inequality, and (iii) carries the inflated divergence through the high-probability converse to produce the √(KT log(1/δ)) scaling. The revised section will contain the full chain of inequalities with all constants displayed. revision: yes

Circularity Check

0 steps flagged

No load-bearing circularity; derivations use standard inequalities on restricted classes

full rationale

The paper derives δ-explicit minimax quantile lower bounds via high-probability interactive Fano and Le Cam methods, then extends to MI privacy by restricting the admissible decision class. No quoted step reduces a prediction or bound to a fitted parameter or self-citation by construction; the two-point templates and variance-inflation factors are obtained from standard information-theoretic converses applied to the restricted class. The central claims remain independent of the paper's own outputs.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on domain assumptions from statistical decision theory and information theory; no free parameters are fitted to data and no new entities are postulated.

axioms (2)
  • domain assumption Equivalence between strict and lower minimax quantiles holds outside a countable set of confidence levels
    Invoked in the structural relations between minimax quantiles and minimax risk.
  • domain assumption Mutual information privacy can be incorporated by restricting the admissible decision class
    Used when extending the quantile theory to private ISDM problems.

pith-pipeline@v0.9.1-grok · 5841 in / 1440 out tokens · 33685 ms · 2026-06-26T09:06:25.457962+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

40 extracted references · 2 linked inside Pith

  1. [1]

    High-probability minimax lower bounds,

    T. Ma, K. A. Verchand, and R. J. Samworth, “High-probability minimax lower bounds,”arXiv preprint arXiv:2406.13447, 2024

  2. [2]

    Minimax policies for adversarial and stochastic bandits,

    J.-Y . Audibert and S. Bubeck, “Minimax policies for adversarial and stochastic bandits,” inCOLT, 2009, pp. 217–226

  3. [3]

    Minimax regret bounds for reinforcement learning,

    M. G. Azar, I. Osband, and R. Munos, “Minimax regret bounds for reinforcement learning,” inInternational conference on machine learning. PMLR, 2017, pp. 263–272

  4. [4]

    Is q-learning provably efficient?

    C. Jin, Z. Allen-Zhu, S. Bubeck, and M. I. Jordan, “Is q-learning provably efficient?”Advances in neural information processing systems, vol. 31, 2018

  5. [5]

    Lattimore and C

    T. Lattimore and C. Szepesv ´ari,Bandit algorithms. Cambridge University Press, 2020

  6. [6]

    The statistical complexity of interactive decision making,

    D. J. Foster, S. M. Kakade, J. Qian, and A. Rakhlin, “The statistical complexity of interactive decision making,”arXiv preprint arXiv:2112.13487, 2021

  7. [7]

    Convergence of estimates under dimensionality restrictions,

    L. LeCam, “Convergence of estimates under dimensionality restrictions,”The Annals of Statistics, pp. 38–53, 1973

  8. [8]

    Lecture notes on information theory,

    Y . Polyanskiy and Y . Wu, “Lecture notes on information theory,”Lecture Notes for ECE563 (UIUC) and, vol. 6, no. 2012-2016, p. 7, 2014

  9. [9]

    T. M. Cover,Elements of information theory. John Wiley & Sons, 1999

  10. [10]

    Assouad, fano, and le cam,

    B. Yu, “Assouad, fano, and le cam,” inFestschrift for Lucien Le Cam: research papers in probability and statistics. Springer, 1997, pp. 423–435

  11. [11]

    On bayes risk lower bounds,

    X. Chen, Y . Zhanget al., “On bayes risk lower bounds,”Journal of Machine Learning Research, vol. 17, no. 218, pp. 1–58, 2016

  12. [12]

    Distance-based and continuum fano inequalities with applications to statistical estimation,

    J. C. Duchi and M. J. Wainwright, “Distance-based and continuum fano inequalities with applications to statistical estimation,”arXiv preprint arXiv:1311.2669, 2013

  13. [13]

    Assouad, fano, and le cam with interaction: A unifying lower bound framework and characterization for bandit learnability,

    F. Chen, D. J. Foster, Y . Han, J. Qian, A. Rakhlin, and Y . Xu, “Assouad, fano, and le cam with interaction: A unifying lower bound framework and characterization for bandit learnability,”Advances in Neural Information Processing Systems, vol. 37, pp. 75 585–75 641, 2024

  14. [14]

    On the privacy-utility trade-off with and without direct access to the private data,

    A. Zamani, T. J. Oechtering, and M. Skoglund, “On the privacy-utility trade-off with and without direct access to the private data,”IEEE Transactions on Information Theory, vol. 70, no. 3, pp. 2177–2200, 2023

  15. [15]

    Bounds for privacy-utility trade-off with non-zero leakage,

    ——, “Bounds for privacy-utility trade-off with non-zero leakage,” in2022 IEEE International Symposium on Information Theory (ISIT). IEEE, 2022, pp. 620–625

  16. [16]

    Risk level dependent minimax quantile lower bounds for interactive statistical decision making,

    R. Bongole, A. Zamani, T. J. Oechtering, and M. Skoglund, “Risk level dependent minimax quantile lower bounds for interactive statistical decision making,” inICASSP 2026-2026 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2026, pp. 5466–5470

  17. [17]

    Utility-privacy tradeoffs in databases: An information-theoretic approach,

    L. Sankar, S. R. Rajagopalan, and H. V . Poor, “Utility-privacy tradeoffs in databases: An information-theoretic approach,”IEEE Transactions on Information Forensics and Security, vol. 8, no. 6, pp. 838–852, 2013

  18. [18]

    Information-theoretic metrics for security and privacy,

    F. d. P. Calmon, “Information-theoretic metrics for security and privacy,” Ph.D. dissertation, Massachusetts Institute of Technology, 2015

  19. [19]

    On the robustness of information-theoretic privacy measures and mechanisms,

    M. Diaz, H. Wang, F. P. Calmon, and L. Sankar, “On the robustness of information-theoretic privacy measures and mechanisms,”IEEE Transactions on Information Theory, vol. 66, no. 4, pp. 1949–1978, 2019

  20. [20]

    Local privacy and statistical minimax rates,

    J. C. Duchi, M. I. Jordan, and M. J. Wainwright, “Local privacy and statistical minimax rates,” in2013 IEEE 54th annual symposium on foundations of computer science. IEEE, 2013, pp. 429–438

  21. [21]

    Minimax optimal procedures for locally private estimation,

    ——, “Minimax optimal procedures for locally private estimation,”Journal of the American Statistical Association, vol. 113, no. 521, pp. 182–201, 2018

  22. [22]

    Lower bounds for locally private estimation via communication complexity,

    J. Duchi and R. Rogers, “Lower bounds for locally private estimation via communication complexity,” inConference on Learning Theory. PMLR, 2019, pp. 1161–1191

  23. [23]

    Algorithms for differentially private multi-armed bandits,

    A. Tossou and C. Dimitrakakis, “Algorithms for differentially private multi-armed bandits,” inProceedings of the AAAI conference on artificial intelligence, vol. 30, no. 1, 2016

  24. [24]

    Differential privacy for multi-armed bandits: What is it and what is its cost?

    D. Basu, C. Dimitrakakis, and A. Tossou, “Differential privacy for multi-armed bandits: What is it and what is its cost?”arXiv preprint arXiv:1905.12298, 2019

  25. [25]

    Multi-armed bandits with local differential privacy,

    W. Ren, X. Zhou, J. Liu, and N. B. Shroff, “Multi-armed bandits with local differential privacy,”arXiv preprint arXiv:2007.03121, 2020

  26. [26]

    Concentrated differential privacy for bandits,

    A. Azize and D. Basu, “Concentrated differential privacy for bandits,” in2024 IEEE Conference on Secure and Trustworthy Machine Learning (SaTML). IEEE, 2024, pp. 78–109

  27. [27]

    Decision making in changing environments: Robustness, query-based learning, and differential privacy,

    F. Chen and A. Rakhlin, “Decision making in changing environments: Robustness, query-based learning, and differential privacy,”arXiv preprint arXiv:2501.14928, 2025

  28. [28]

    The privacy funnel from the viewpoint of local differential privacy,

    M. Lopuha ¨a-Zwakenberg, “The privacy funnel from the viewpoint of local differential privacy,”arXiv preprint arXiv:2002.01501, 2020

  29. [29]

    On perfect privacy,

    B. Rassouli and D. G ¨und¨uz, “On perfect privacy,”IEEE Journal on Selected Areas in Information Theory, vol. 2, no. 1, pp. 177–191, 2021

  30. [30]

    Information extraction under privacy constraints,

    S. Asoodeh, M. Diaz, F. Alajaji, and T. Linder, “Information extraction under privacy constraints,”Information, vol. 7, no. 1, p. 15, 2016

  31. [31]

    On privacy-utility tradeoffs for constrained data release mechanisms,

    Y . O. Basciftci, Y . Wang, and P. Ishwar, “On privacy-utility tradeoffs for constrained data release mechanisms,” in2016 Information Theory and Applications Workshop (ITA). IEEE, 2016, pp. 1–6

  32. [32]

    Secrecy by design with applications to privacy and compression,

    Y . Y . Shkel, R. S. Blum, and H. V . Poor, “Secrecy by design with applications to privacy and compression,”IEEE Transactions on Information Theory, vol. 67, no. 2, pp. 824–843, 2020

  33. [33]

    Principal inertia components and applications,

    F. du Pin Calmon, A. Makhdoumi, M. M ´edard, M. Varia, M. Christiansen, and K. R. Duffy, “Principal inertia components and applications,”IEEE Transactions on Information Theory, vol. 63, no. 8, pp. 5011–5038, 2017

  34. [34]

    On information theoretic fairness: Compressed representations with perfect demographic parity,

    A. Zamani, B. Rodr ´ıguez-G´alvez, and M. Skoglund, “On information theoretic fairness: Compressed representations with perfect demographic parity,” in2024 IEEE Information Theory Workshop (ITW). IEEE, 2024, pp. 25–30

  35. [35]

    Variable-length coding with zero and non-zero privacy leakage,

    A. Zamani and M. Skoglund, “Variable-length coding with zero and non-zero privacy leakage,”Entropy, vol. 27, no. 2, p. 124, 2025

  36. [36]

    From the information bottleneck to the privacy funnel,

    A. Makhdoumi, S. Salamatian, N. Fawaz, and M. M ´edard, “From the information bottleneck to the privacy funnel,” in2014 IEEE Information Theory Workshop (ITW 2014). IEEE, 2014, pp. 501–505

  37. [37]

    Evaluation and analysis of the performance of the exp3 algorithm in stochastic environments,

    Y . Seldin, C. Szepesv ´ari, P. Auer, and Y . Abbasi-Yadkori, “Evaluation and analysis of the performance of the exp3 algorithm in stochastic environments,” inEuropean Workshop on Reinforcement Learning. PMLR, 2013, pp. 103–116

  38. [38]

    Unified framework of distributional regret in multi-armed bandits and reinforcement learning,

    H. Lee and M.-h. Oh, “Unified framework of distributional regret in multi-armed bandits and reinforcement learning,”arXiv preprint arXiv:2605.05102, 2026

  39. [39]

    High-probability bounds for heterogeneous local differential privacy,

    M. Aliakbarpour, A. Fallah, S. Roy, and R. Stevens, “High-probability bounds for heterogeneous local differential privacy,”arXiv preprint arXiv:2510.11895, 2025

  40. [40]

    Optimal rates of (locally) differentially private heavy-tailed multi-armed bandits,

    Y . Tao, Y . Wu, P. Zhao, and D. Wang, “Optimal rates of (locally) differentially private heavy-tailed multi-armed bandits,” inInternational Conference on Artificial Intelligence and Statistics. PMLR, 2022, pp. 1546–1574. 20 APPENDIX Proof of Theorem 1.FixM∈ MandALG∈ D, and let rM,ALG := Quantile(1−δ,P M,ALG, L). By the definition of the strict quantile, ...