Sequential 1-bit Mean Estimation with Near-Optimal Sample Complexity

Ivan Lau; Jonathan Scarlett

arxiv: 2509.21940 · v2 · submitted 2025-09-26 · 📊 stat.ML · cs.IT· cs.LG· math.IT· math.ST· stat.TH

Sequential 1-bit Mean Estimation with Near-Optimal Sample Complexity

Ivan Lau , Jonathan Scarlett This is my paper

Pith reviewed 2026-05-18 13:31 UTC · model grok-4.3

classification 📊 stat.ML cs.ITcs.LGmath.ITmath.STstat.TH

keywords mean estimation1-bit communicationsequential queriessample complexityPAC learningadaptive estimationinterval queriesdistributed statistics

0 comments

The pith

Sequential 1-bit interval queries can estimate the mean of bounded-mean bounded-variance distributions with sample complexity matching the unquantized minimax rate up to logarithmic factors plus an unavoidable log term.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a mean estimator that uses randomized interval queries chosen sequentially based on previous 1-bit outcomes. Each query asks whether a new sample falls inside a chosen interval, providing 1 bit of information. This adaptive approach achieves an (ε, δ)-PAC guarantee with sample complexity Õ(σ²/ε² log 1/δ + log λ/σ) for distributions with mean bounded by λ and variance by σ². A sympathetic reader would care because this nearly matches the best possible rate even without any communication constraints, showing that 1-bit sequential feedback loses little efficiency. The work also proves that non-adaptive estimators cannot achieve the same rate when the ratio λ/σ is large.

Core claim

The authors prove that their adaptive interval-query estimator is (ε, δ)-PAC for any distribution satisfying |E[X]| ≤ λ and Var(X) ≤ σ² using Õ(σ²/ε² log(1/δ) + log(λ/σ)) samples. This bound matches the minimax lower bound for the unquantized (full-information) setting up to logarithmic factors, with the extra log(λ/σ) term shown to be necessary even for interval queries. They further establish an adaptivity gap, where the best non-adaptive mean estimator requires substantially more samples for large λ/σ.

What carries the argument

Sequentially chosen randomized interval queries, where each 1-bit response indicates whether the sample lies in the adaptively selected interval, allowing progressive refinement of the mean estimate with minimal communication.

If this is right

The estimator works for all distributions with the given bounds on mean and variance.
It achieves near-optimality compared to full-precision sampling.
Non-adaptive interval-query estimators are provably worse when λ/σ is large.
Variants exist for unknown sampling budget and unknown variance within bounds.
Stronger tail decay allows tightened bounds.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Such methods could enable efficient distributed mean estimation in resource-constrained networks where only 1 bit per sample can be sent.
Extending this to multi-dimensional means or other statistics might follow similar adaptive query strategies.
The two-stage variant suggests limited adaptivity suffices for near-optimal performance.
Practical tests with real data varying λ/σ would show where the extra log term becomes noticeable.

Load-bearing premise

The bounds λ on the mean and σ on the standard deviation are known in advance and used to design the sequence of interval queries.

What would settle it

A calculation showing a distribution with |mean| ≤ λ and Var ≤ σ² that requires strictly more than O(log λ/σ) extra samples beyond the main σ²/ε² term, or an explicit non-adaptive estimator that matches the adaptive rate even for large λ/σ.

read the original abstract

In this paper, we study the problem of distributed mean estimation with 1-bit communication constraints. We propose a mean estimator that is based on (randomized and sequentially-chosen) interval queries, whose 1-bit outcome indicates whether the given sample lies in the specified interval. Our estimator is $(\epsilon, \delta)$-PAC for all distributions with bounded mean ($-\lambda \le \mathbb{E}(X) \le \lambda $) and variance ($\mathrm{Var}(X) \le \sigma^2$) for some known parameters $\lambda$ and $\sigma$. We derive a sample complexity bound $\widetilde{O}\big( \frac{\sigma^2}{\epsilon^2}\log\frac{1}{\delta} + \log\frac{\lambda}{\sigma}\big)$, which matches the minimax lower bound for the unquantized setting up to logarithmic factors and the additional $\log\frac{\lambda}{\sigma}$ term that we show to be unavoidable. We also establish an adaptivity gap for interval-query based estimators: the best non-adaptive mean estimator is considerably worse than our adaptive mean estimator for large $\frac{\lambda}{\sigma}$. Finally, we give tightened sample complexity bounds for distributions with stronger tail decay, and present additional variants that (i) handle an unknown sampling budget (ii) adapt to the unknown true variance given (possibly loose) upper and lower bounds on the variance, and (iii) use only two stages of adaptivity at the expense of more complicated (non-interval) queries.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 3 minor

Summary. The paper introduces a sequential adaptive mean estimator using randomized interval queries under 1-bit communication. For distributions with known bounds λ on the mean and σ on the standard deviation, it achieves (ε, δ)-PAC estimation with sample complexity Õ(σ²/ε² log(1/δ) + log(λ/σ)). This nearly matches the minimax lower bound from the unquantized setting, and the paper provides a matching lower bound showing the extra log term is unavoidable for interval-query estimators. It further establishes an adaptivity gap, provides bounds for sub-Gaussian tails, and includes variants for unknown parameters and limited adaptivity.

Significance. This contribution is significant for the field of distributed and communication-constrained statistical estimation. By achieving near-optimal sample complexity with minimal communication via adaptive queries, it shows that adaptivity can mitigate the cost of quantization. The explicit lower bound for the adaptivity gap and the extensions to unknown variance make the results more robust and applicable. Credit is due for the parameter-free aspects in the core bound and the reproducible theoretical analysis if proofs are complete.

major comments (2)

§4.1, Eq. (8): The recursive definition of the interval centers in the adaptive procedure assumes prior knowledge of σ; the variance reduction step should be shown to hold uniformly over all possible distributions satisfying the variance bound.
§6, Theorem 3: In the proof of the lower bound, the information-theoretic argument uses KL divergence between specific two-point distributions; confirm that this extends to the general case with bounded variance without additional tail assumptions.

minor comments (3)

Abstract: The notation for the sample complexity bound uses Õ; ensure consistency with the definition provided in Section 2.
Section 3: The description of the randomized interval query could benefit from a pseudocode algorithm to improve clarity.
References: Several recent works on 1-bit quantization in estimation (e.g., from 2023-2024) are missing; consider adding them for completeness.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thorough review and constructive comments on our work. We address each major comment below in detail. The manuscript has been revised to incorporate clarifications where appropriate, strengthening the presentation without altering the core results.

read point-by-point responses

Referee: §4.1, Eq. (8): The recursive definition of the interval centers in the adaptive procedure assumes prior knowledge of σ; the variance reduction step should be shown to hold uniformly over all possible distributions satisfying the variance bound.

Authors: We appreciate this observation. As stated in the problem setup (Section 2), σ is a known upper bound on the variance. The recursive construction in Eq. (8) employs this known σ to determine interval widths and centers. The variance reduction argument in the analysis of the adaptive procedure relies solely on the bound Var(X) ≤ σ² together with the mean bound |E[X]| ≤ λ; it does not require knowledge of the exact variance or any further distributional properties. Consequently, the guarantees hold uniformly over the entire class of distributions satisfying these moment bounds. We have added a clarifying remark immediately following Eq. (8) to make this uniformity explicit. revision: yes
Referee: §6, Theorem 3: In the proof of the lower bound, the information-theoretic argument uses KL divergence between specific two-point distributions; confirm that this extends to the general case with bounded variance without additional tail assumptions.

Authors: The lower bound of Theorem 3 applies to the minimax risk over all distributions with |E[X]| ≤ λ and Var(X) ≤ σ². The proof proceeds by exhibiting a pair of two-point distributions that lie inside this class (each with variance exactly equal to σ²) and applying the standard KL-based information-theoretic argument to this pair. Because the minimax quantity is at least as large as the risk on any subclass, the resulting lower bound carries over directly to the full class. No tail assumptions beyond the variance bound are invoked. We have inserted a short paragraph at the beginning of the proof of Theorem 3 to emphasize that the constructed distributions satisfy the problem constraints and that the argument requires no additional assumptions. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper proposes an adaptive estimator using randomized sequential interval queries and derives its sample complexity upper bound through direct analysis of the estimator's performance under the stated bounded-mean and bounded-variance assumptions. This upper bound is then compared to an external minimax lower bound known for the unquantized mean estimation problem, with the extra log(λ/σ) term justified by a separate lower-bound argument for interval-query estimators. No steps reduce by construction to fitted parameters, self-definitions, or load-bearing self-citations; the derivation remains self-contained against external benchmarks and does not rename or smuggle in prior results via citation chains.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on standard probabilistic concentration tools and the assumption of known bounds λ and σ to enable adaptive interval selection; no new entities or fitted parameters are introduced in the abstract.

axioms (1)

standard math Standard concentration inequalities suffice to analyze the (ε, δ)-PAC guarantee of the sequential estimator.
The sample complexity derivation relies on probabilistic tail bounds for the adaptive querying process.

pith-pipeline@v0.9.0 · 5811 in / 1255 out tokens · 51787 ms · 2026-05-18T13:31:04.559596+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

We derive a sample complexity bound Õ(σ²/ε² log 1/δ + log λ/σ), which matches the minimax lower bound for the unquantized setting up to logarithmic factors and the additional log λ/σ term
IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Our estimator is (ε, δ)-PAC for all distributions with bounded mean (−λ ≤ E(X) ≤ λ) and variance (Var(X) ≤ σ²)

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

56 extracted references · 56 canonical work pages

[1]

Acharya, J., Canonne, C., Liu, Y., Sun, Z., and Tyagi, H. (2021a). Distributed estimation with multiple samples per user: Sharp rates and phase transition. In Advances in Neural Information Processing Systems , volume 34, pages 18920--18931

work page
[2]

L., Liu, Y., Sun, Z., and Tyagi, H

Acharya, J., Canonne, C. L., Liu, Y., Sun, Z., and Tyagi, H. (2021b). Interactive inference under information constraints. IEEE Transactions on Information Theory , 68(1):502--516

work page
[3]

L., Singh, A

Acharya, J., Canonne, C. L., Singh, A. V., and Tyagi, H. (2021c). Optimal rates for nonparametric density estimation under communication constraints. CoRR , abs/2107.10078

work page arXiv
[4]

L., Sun, Z., and Tyagi, H

Acharya, J., Canonne, C. L., Sun, Z., and Tyagi, H. (2023). Unified lower bounds for interactive high-dimensional estimation under information constraints. Advances in Neural Information Processing Systems , 36:51133--51165

work page 2023
[5]

L., and Tyagi, H

Acharya, J., Canonne, C. L., and Tyagi, H. (2020a). Inference under information constraints I : L ower bounds from chi-square contraction. IEEE Transactions on Information Theory , 66(12):7835--7855

work page
[6]

L., and Tyagi, H

Acharya, J., Canonne, C. L., and Tyagi, H. (2020b). Inference under information constraints II: communication constraints and shared randomness. IEEE Transactions on Information Theory , 66(12):7856--7877

work page
[7]

Acharya, J., Kairouz, P., Liu, Y., and Sun, Z. (2021d). Estimating sparse discrete distributions under privacy and communication constraints. In Algorithmic Learning Theory (ALT) , pages 79--98. PMLR

work page
[8]

S., Kumar, R., and Vatedka, S

Babu, N. S., Kumar, R., and Vatedka, S. (2025). Unbiased quantization of the l_1 ball for communication-efficient distributed mean estimation. In International Conference on Artificial Intelligence and Statistics (AISTATS) , volume 258, pages 1270--1278. PMLR

work page 2025
[9]

P., Han, Y., and \" O zg \" u r, A

Barnes, L. P., Han, Y., and \" O zg \" u r, A. (2019). F isher information for distributed estimation under a blackboard communication protocol. In IEEE International Symposium on Information Theory (ISIT) , pages 2704--2708. IEEE

work page 2019
[10]

P., Han, Y., and \" O zg\" u r, A

Barnes, L. P., Han, Y., and \" O zg\" u r, A. (2020). Lower bounds for learning distributions under communication constraints via F isher information. Journal of Machine Learning Research , 21:1--30

work page 2020
[11]

Ben-Basat, R., Vargaftik, S., Portnoy, A., Einziger, G., Ben-Itzhak, Y., and Mitzenmacher, M. (2024). Accelerating federated learning with quick distributed mean estimation. In International Conference on Machine Learning (ICML) , volume 235, pages 3410--3442. PMLR

work page 2024
[12]

Boucheron, S., Lugosi, G., and Massart, P. (2013). Concentration Inequalities: A Nonasymptotic Theory of Independence . Oxford University Press

work page 2013
[13]

L., and Woodruff, D

Braverman, M., Garg, A., Ma, T., Nguyen, H. L., and Woodruff, D. P. (2016). Communication lower bounds for statistical estimation problems via a distributed data processing inequality. In Symposium on Theory of Computing Conference, STOC'16 , pages 1011--1020. ACM

work page 2016
[14]

Cai, T. T. and Wei, H. (2022a). Distributed adaptive G aussian mean estimation with unknown variance: Interactive protocol helps adaptation. The Annals of Statistics , 50(4):1992--2020

work page 1992
[15]

Cai, T. T. and Wei, H. (2022b). Distributed nonparametric function estimation: Optimal rate of convergence and cost of adaptation. The Annals of Statistics , 50(2):698--725

work page
[16]

Cai, T. T. and Wei, H. (2024). Distributed G aussian mean estimation under communication constraints: O ptimal rates and communication-efficient algorithms. Journal of Machine Learning Research , 25(37):1--63

work page 2024
[17]

Cherapanamjeri, Y., Tripuraneni, N., Bartlett, P., and Jordan, M. (2022). Optimal mean estimation without a variance. In Conference on Learning Theory , pages 356--357. PMLR

work page 2022
[18]

Dang, T., Lee, J., Song, M., and Valiant, P. (2023). Optimality in mean estimation: Beyond worst-case, beyond sub- G aussian, and beyond 1+ moments. In Advances in Neural Information Processing Systems , volume 36, pages 4150--4176

work page 2023
[19]

Davies, P., Gurunanthan, V., Moshrefi, N., Ashkboos, S., and Alistarh, D. (2021). New bounds for distributed mean estimation and variance reduction. In International Conference on Learning Representations, (ICLR)

work page 2021
[20]

Garg, A., Ma, T., and Nguyen, H. L. (2014). On communication cost of distributed statistical estimation and dimensionality. In Advances in Neural Information Processing Systems 27 , pages 2726--2734

work page 2014
[21]

and Price, E

Gretta, L. and Price, E. (2024). Sharp Noisy Binary Search with Monotonic Probabilities . In 51st International Colloquium on Automata, Languages, and Programming (ICALP) , volume 297, pages 75:1--75:19

work page 2024
[22]

Gupta, S., Hopkins, S., and Price, E. (2024). Beyond catoni: Sharper rates for heavy-tailed and robust mean estimation. In Conference on Learning Theory (COLT) , pages 2232--2269. PMLR

work page 2024
[23]

Han, Y., Mukherjee, P., \"O zg \"u r , A., and Weissman, T. (2018a). Distributed statistical estimation of high-dimensional and non-parametric distributions. In IEEE International Symposium on Information Theory (ISIT) , pages 506--510

work page
[24]

Han, Y., \" O zg \" u r, A., and Weissman, T. (2018b). Geometric lower bounds for distributed parameter estimation under communication constraints. In Conference on Learning Theory (COLT) , volume 75, pages 3163--3188. PMLR

work page
[25]

A., Yang, L., and Fragouli, C

Hanna, O. A., Yang, L., and Fragouli, C. (2022). Solving Multi-Arm Bandit Using a Few Bits of Communication . In International Conference on Artificial Intelligence and Statistics (AISTATS) , pages 11215--11236

work page 2022
[26]

He, S., Shin, H.-S., Xu, S., and Tsourdos, A. (2020). Distributed estimation over a low-cost sensor network: A review of state-of-the-art. Information Fusion , 54:21--43

work page 2020
[27]

and Duchi, J

Kipnis, A. and Duchi, J. C. (2022). Mean estimation from one-bit measurements. IEEE Transactions on Information Theory , 68(9):6276--6296

work page 2022
[28]

and Richt \'a rik, P

Kone c n \`y , J. and Richt \'a rik, P. (2018). Randomized distributed mean estimation: Accuracy vs. communication. Frontiers in Applied Mathematics and Statistics , 4:62

work page 2018
[29]

and Vatedka, S

Kumar, R. and Vatedka, S. (2025). One-bit distributed mean estimation with unknown variance. arXiv preprint arXiv:2501.18502

work page arXiv 2025
[30]

and Scarlett, J

Lau, I. and Scarlett, J. (2025). Quantile multi-armed bandits with 1-bit feedback. In Algorithmic Learning Theory , pages 664--699. PMLR

work page 2025
[31]

Lee, J. (2020). CSCI 1951-W Sublinear Algorithms for Big Data, Lecture 11 . https://cs.brown.edu/courses/csci1951-w/lec/lec Accessed: 2025-07-01

work page 2020
[32]

Lee, J. C. and Valiant, P. (2022). Optimal sub- G aussian mean estimation in R . In IEEE Annual Symposium on Foundations of Computer Science (FOCS) , pages 672--683. IEEE

work page 2022
[33]

and Mendelson, S

Lugosi, G. and Mendelson, S. (2019). Sub- G aussian estimators of the mean of a random vector. The Annals of Statistics , 47(2):783 -- 794

work page 2019
[34]

Luo, Z.-Q. (2005). Universal decentralized estimation in a bandwidth constrained sensor network. IEEE Transactions on Information Theory , 51(6):2210--2219

work page 2005
[35]

Mayekar, P., Scarlett, J., and Tan, V. Y. (2023). Communication-constrained bandits under additive G aussian noise . In International Conference on Machine Learning (ICML) , pages 24236--24250

work page 2023
[36]

T., and Tyagi, H

Mayekar, P., Suresh, A. T., and Tyagi, H. (2021). Wyner-ziv estimators: Efficient distributed mean estimation with side-information. In International Conference on Artificial Intelligence and Statistics (AISTATS) , pages 3502--3510. PMLR

work page 2021
[37]

Minsker, S. (2023). Efficient median of means estimator. In Conference on Learning Theory (COLT) , pages 5925--5933. PMLR

work page 2023
[38]

and Wu, Y

Polyanskiy, Y. and Wu, Y. (2025). Information Theory: From Coding to Learning . Cambridge University Press

work page 2025
[39]

and Giannakis, G

Ribeiro, A. and Giannakis, G. B. (2006a). Bandwidth-constrained distributed estimation for wireless sensor networks-part i: G aussian case. IEEE Transactions on Signal Processing , 54(3):1131--1143

work page
[40]

and Giannakis, G

Ribeiro, A. and Giannakis, G. B. (2006b). Bandwidth-constrained distributed estimation for wireless sensor networks-part ii: U nknown probability density function. IEEE Transactions on Signal Processing , 54(7):2784--2796

work page
[41]

Shah, J., Cardone, M., Rush, C., and Dytso, A. (2025). Generalized linear models with 1-bit measurements: Asymptotics of the maximum likelihood estimator. In International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pages 1--5. IEEE

work page 2025
[42]

Shamir, O. (2014). Fundamental limits of online and distributed algorithms for statistical learning and estimation. In Advances in Neural Information Processing Systems 27 , pages 163--171

work page 2014
[43]

T., Felix, X

Suresh, A. T., Felix, X. Y., Kumar, S., and McMahan, H. B. (2017). Distributed mean estimation with limited communication. In International Conference on Machine Learning (ICML) , pages 3329--3337. PMLR

work page 2017
[44]

T., Sun, Z., Ro, J., and Yu, F

Suresh, A. T., Sun, Z., Ro, J., and Yu, F. (2022). Correlated quantization for distributed mean estimation and optimization. In International Conference on Machine Learning , pages 20856--20876. PMLR

work page 2022
[45]

and van Zanten, H

Szab \'o , B. and van Zanten, H. (2018). Adaptive distributed methods under communication constraints. The Annals of Statistics

work page 2018
[46]

and van Zanten, H

Szab \'o , B. and van Zanten, H. (2020). Distributed function estimation: Adaptation using minimal communication. Mathematical Statistics and Learning

work page 2020
[47]

B., Portnoy, A., Mendelson, G., Itzhak, Y

Vargaftik, S., Basat, R. B., Portnoy, A., Mendelson, G., Itzhak, Y. B., and Mitzenmacher, M. (2022). EDEN : Communication-efficient and robust distributed mean estimation for federated learning. In International Conference on Machine Learning (ICML) , volume 162, pages 21984--22014. PMLR

work page 2022
[48]

Vargaftik, S., Ben-Basat, R., Portnoy, A., Mendelson, G., Ben-Itzhak, Y., and Mitzenmacher, M. (2021). Drive: One-bit distributed mean estimation. In Advances in Neural Information Processing Systems , volume 34, pages 362--377

work page 2021
[49]

Varshney, P. K. (2012). Distributed detection and data fusion . Springer Science & Business Media

work page 2012
[50]

Veeravalli, V. V. and Varshney, P. K. (2012). Distributed inference in wireless sensor networks. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences , 370(1958):100--117

work page 2012
[51]

Vershynin, R. (2026). High-Dimensional Probability: An Introduction with Applications in Data Science . Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 2 edition

work page 2026
[52]

Xiao, J.-J., Ribeiro, A., Luo, Z.-Q., and Giannakis, G. B. (2006). Distributed compression-estimation using wireless sensor networks. IEEE Signal Processing Magazine , 23(4):27--41

work page 2006
[53]

and Raginsky, M

Xu, A. and Raginsky, M. (2017). Information-theoretic lower bounds on Bayes risk in decentralized estimation . IEEE Transactions on Information Theory , 63(3):1580--1600

work page 2017
[54]

and Szab \'o , B

Zaman, A. and Szab \'o , B. (2022). Distributed nonparametric estimation under communication constraints. arXiv preprint arXiv:2204.10373

work page arXiv 2022
[55]

I., and Wainwright, M

Zhang, Y., Duchi, J., Jordan, M. I., and Wainwright, M. J. (2013). Information-theoretic lower bounds for distributed statistical estimation with communication constraints. In Advances in Neural Information Processing Systems 26 , pages 2328--2336

work page 2013
[56]

and Lafferty, J

Zhu, Y. and Lafferty, J. (2018). Distributed nonparametric regression under communication constraints. In International Conference on Machine Learning (ICML) , pages 6009--6017. PMLR

work page 2018

[1] [1]

Acharya, J., Canonne, C., Liu, Y., Sun, Z., and Tyagi, H. (2021a). Distributed estimation with multiple samples per user: Sharp rates and phase transition. In Advances in Neural Information Processing Systems , volume 34, pages 18920--18931

work page

[2] [2]

L., Liu, Y., Sun, Z., and Tyagi, H

Acharya, J., Canonne, C. L., Liu, Y., Sun, Z., and Tyagi, H. (2021b). Interactive inference under information constraints. IEEE Transactions on Information Theory , 68(1):502--516

work page

[3] [3]

L., Singh, A

Acharya, J., Canonne, C. L., Singh, A. V., and Tyagi, H. (2021c). Optimal rates for nonparametric density estimation under communication constraints. CoRR , abs/2107.10078

work page arXiv

[4] [4]

L., Sun, Z., and Tyagi, H

Acharya, J., Canonne, C. L., Sun, Z., and Tyagi, H. (2023). Unified lower bounds for interactive high-dimensional estimation under information constraints. Advances in Neural Information Processing Systems , 36:51133--51165

work page 2023

[5] [5]

L., and Tyagi, H

Acharya, J., Canonne, C. L., and Tyagi, H. (2020a). Inference under information constraints I : L ower bounds from chi-square contraction. IEEE Transactions on Information Theory , 66(12):7835--7855

work page

[6] [6]

L., and Tyagi, H

Acharya, J., Canonne, C. L., and Tyagi, H. (2020b). Inference under information constraints II: communication constraints and shared randomness. IEEE Transactions on Information Theory , 66(12):7856--7877

work page

[7] [7]

Acharya, J., Kairouz, P., Liu, Y., and Sun, Z. (2021d). Estimating sparse discrete distributions under privacy and communication constraints. In Algorithmic Learning Theory (ALT) , pages 79--98. PMLR

work page

[8] [8]

S., Kumar, R., and Vatedka, S

Babu, N. S., Kumar, R., and Vatedka, S. (2025). Unbiased quantization of the l_1 ball for communication-efficient distributed mean estimation. In International Conference on Artificial Intelligence and Statistics (AISTATS) , volume 258, pages 1270--1278. PMLR

work page 2025

[9] [9]

P., Han, Y., and \" O zg \" u r, A

Barnes, L. P., Han, Y., and \" O zg \" u r, A. (2019). F isher information for distributed estimation under a blackboard communication protocol. In IEEE International Symposium on Information Theory (ISIT) , pages 2704--2708. IEEE

work page 2019

[10] [10]

P., Han, Y., and \" O zg\" u r, A

Barnes, L. P., Han, Y., and \" O zg\" u r, A. (2020). Lower bounds for learning distributions under communication constraints via F isher information. Journal of Machine Learning Research , 21:1--30

work page 2020

[11] [11]

Ben-Basat, R., Vargaftik, S., Portnoy, A., Einziger, G., Ben-Itzhak, Y., and Mitzenmacher, M. (2024). Accelerating federated learning with quick distributed mean estimation. In International Conference on Machine Learning (ICML) , volume 235, pages 3410--3442. PMLR

work page 2024

[12] [12]

Boucheron, S., Lugosi, G., and Massart, P. (2013). Concentration Inequalities: A Nonasymptotic Theory of Independence . Oxford University Press

work page 2013

[13] [13]

L., and Woodruff, D

Braverman, M., Garg, A., Ma, T., Nguyen, H. L., and Woodruff, D. P. (2016). Communication lower bounds for statistical estimation problems via a distributed data processing inequality. In Symposium on Theory of Computing Conference, STOC'16 , pages 1011--1020. ACM

work page 2016

[14] [14]

Cai, T. T. and Wei, H. (2022a). Distributed adaptive G aussian mean estimation with unknown variance: Interactive protocol helps adaptation. The Annals of Statistics , 50(4):1992--2020

work page 1992

[15] [15]

Cai, T. T. and Wei, H. (2022b). Distributed nonparametric function estimation: Optimal rate of convergence and cost of adaptation. The Annals of Statistics , 50(2):698--725

work page

[16] [16]

Cai, T. T. and Wei, H. (2024). Distributed G aussian mean estimation under communication constraints: O ptimal rates and communication-efficient algorithms. Journal of Machine Learning Research , 25(37):1--63

work page 2024

[17] [17]

Cherapanamjeri, Y., Tripuraneni, N., Bartlett, P., and Jordan, M. (2022). Optimal mean estimation without a variance. In Conference on Learning Theory , pages 356--357. PMLR

work page 2022

[18] [18]

Dang, T., Lee, J., Song, M., and Valiant, P. (2023). Optimality in mean estimation: Beyond worst-case, beyond sub- G aussian, and beyond 1+ moments. In Advances in Neural Information Processing Systems , volume 36, pages 4150--4176

work page 2023

[19] [19]

Davies, P., Gurunanthan, V., Moshrefi, N., Ashkboos, S., and Alistarh, D. (2021). New bounds for distributed mean estimation and variance reduction. In International Conference on Learning Representations, (ICLR)

work page 2021

[20] [20]

Garg, A., Ma, T., and Nguyen, H. L. (2014). On communication cost of distributed statistical estimation and dimensionality. In Advances in Neural Information Processing Systems 27 , pages 2726--2734

work page 2014

[21] [21]

and Price, E

Gretta, L. and Price, E. (2024). Sharp Noisy Binary Search with Monotonic Probabilities . In 51st International Colloquium on Automata, Languages, and Programming (ICALP) , volume 297, pages 75:1--75:19

work page 2024

[22] [22]

Gupta, S., Hopkins, S., and Price, E. (2024). Beyond catoni: Sharper rates for heavy-tailed and robust mean estimation. In Conference on Learning Theory (COLT) , pages 2232--2269. PMLR

work page 2024

[23] [23]

Han, Y., Mukherjee, P., \"O zg \"u r , A., and Weissman, T. (2018a). Distributed statistical estimation of high-dimensional and non-parametric distributions. In IEEE International Symposium on Information Theory (ISIT) , pages 506--510

work page

[24] [24]

Han, Y., \" O zg \" u r, A., and Weissman, T. (2018b). Geometric lower bounds for distributed parameter estimation under communication constraints. In Conference on Learning Theory (COLT) , volume 75, pages 3163--3188. PMLR

work page

[25] [25]

A., Yang, L., and Fragouli, C

Hanna, O. A., Yang, L., and Fragouli, C. (2022). Solving Multi-Arm Bandit Using a Few Bits of Communication . In International Conference on Artificial Intelligence and Statistics (AISTATS) , pages 11215--11236

work page 2022

[26] [26]

He, S., Shin, H.-S., Xu, S., and Tsourdos, A. (2020). Distributed estimation over a low-cost sensor network: A review of state-of-the-art. Information Fusion , 54:21--43

work page 2020

[27] [27]

and Duchi, J

Kipnis, A. and Duchi, J. C. (2022). Mean estimation from one-bit measurements. IEEE Transactions on Information Theory , 68(9):6276--6296

work page 2022

[28] [28]

and Richt \'a rik, P

Kone c n \`y , J. and Richt \'a rik, P. (2018). Randomized distributed mean estimation: Accuracy vs. communication. Frontiers in Applied Mathematics and Statistics , 4:62

work page 2018

[29] [29]

and Vatedka, S

Kumar, R. and Vatedka, S. (2025). One-bit distributed mean estimation with unknown variance. arXiv preprint arXiv:2501.18502

work page arXiv 2025

[30] [30]

and Scarlett, J

Lau, I. and Scarlett, J. (2025). Quantile multi-armed bandits with 1-bit feedback. In Algorithmic Learning Theory , pages 664--699. PMLR

work page 2025

[31] [31]

Lee, J. (2020). CSCI 1951-W Sublinear Algorithms for Big Data, Lecture 11 . https://cs.brown.edu/courses/csci1951-w/lec/lec Accessed: 2025-07-01

work page 2020

[32] [32]

Lee, J. C. and Valiant, P. (2022). Optimal sub- G aussian mean estimation in R . In IEEE Annual Symposium on Foundations of Computer Science (FOCS) , pages 672--683. IEEE

work page 2022

[33] [33]

and Mendelson, S

Lugosi, G. and Mendelson, S. (2019). Sub- G aussian estimators of the mean of a random vector. The Annals of Statistics , 47(2):783 -- 794

work page 2019

[34] [34]

Luo, Z.-Q. (2005). Universal decentralized estimation in a bandwidth constrained sensor network. IEEE Transactions on Information Theory , 51(6):2210--2219

work page 2005

[35] [35]

Mayekar, P., Scarlett, J., and Tan, V. Y. (2023). Communication-constrained bandits under additive G aussian noise . In International Conference on Machine Learning (ICML) , pages 24236--24250

work page 2023

[36] [36]

T., and Tyagi, H

Mayekar, P., Suresh, A. T., and Tyagi, H. (2021). Wyner-ziv estimators: Efficient distributed mean estimation with side-information. In International Conference on Artificial Intelligence and Statistics (AISTATS) , pages 3502--3510. PMLR

work page 2021

[37] [37]

Minsker, S. (2023). Efficient median of means estimator. In Conference on Learning Theory (COLT) , pages 5925--5933. PMLR

work page 2023

[38] [38]

and Wu, Y

Polyanskiy, Y. and Wu, Y. (2025). Information Theory: From Coding to Learning . Cambridge University Press

work page 2025

[39] [39]

and Giannakis, G

Ribeiro, A. and Giannakis, G. B. (2006a). Bandwidth-constrained distributed estimation for wireless sensor networks-part i: G aussian case. IEEE Transactions on Signal Processing , 54(3):1131--1143

work page

[40] [40]

and Giannakis, G

Ribeiro, A. and Giannakis, G. B. (2006b). Bandwidth-constrained distributed estimation for wireless sensor networks-part ii: U nknown probability density function. IEEE Transactions on Signal Processing , 54(7):2784--2796

work page

[41] [41]

Shah, J., Cardone, M., Rush, C., and Dytso, A. (2025). Generalized linear models with 1-bit measurements: Asymptotics of the maximum likelihood estimator. In International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pages 1--5. IEEE

work page 2025

[42] [42]

Shamir, O. (2014). Fundamental limits of online and distributed algorithms for statistical learning and estimation. In Advances in Neural Information Processing Systems 27 , pages 163--171

work page 2014

[43] [43]

T., Felix, X

Suresh, A. T., Felix, X. Y., Kumar, S., and McMahan, H. B. (2017). Distributed mean estimation with limited communication. In International Conference on Machine Learning (ICML) , pages 3329--3337. PMLR

work page 2017

[44] [44]

T., Sun, Z., Ro, J., and Yu, F

Suresh, A. T., Sun, Z., Ro, J., and Yu, F. (2022). Correlated quantization for distributed mean estimation and optimization. In International Conference on Machine Learning , pages 20856--20876. PMLR

work page 2022

[45] [45]

and van Zanten, H

Szab \'o , B. and van Zanten, H. (2018). Adaptive distributed methods under communication constraints. The Annals of Statistics

work page 2018

[46] [46]

and van Zanten, H

Szab \'o , B. and van Zanten, H. (2020). Distributed function estimation: Adaptation using minimal communication. Mathematical Statistics and Learning

work page 2020

[47] [47]

B., Portnoy, A., Mendelson, G., Itzhak, Y

Vargaftik, S., Basat, R. B., Portnoy, A., Mendelson, G., Itzhak, Y. B., and Mitzenmacher, M. (2022). EDEN : Communication-efficient and robust distributed mean estimation for federated learning. In International Conference on Machine Learning (ICML) , volume 162, pages 21984--22014. PMLR

work page 2022

[48] [48]

Vargaftik, S., Ben-Basat, R., Portnoy, A., Mendelson, G., Ben-Itzhak, Y., and Mitzenmacher, M. (2021). Drive: One-bit distributed mean estimation. In Advances in Neural Information Processing Systems , volume 34, pages 362--377

work page 2021

[49] [49]

Varshney, P. K. (2012). Distributed detection and data fusion . Springer Science & Business Media

work page 2012

[50] [50]

Veeravalli, V. V. and Varshney, P. K. (2012). Distributed inference in wireless sensor networks. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences , 370(1958):100--117

work page 2012

[51] [51]

Vershynin, R. (2026). High-Dimensional Probability: An Introduction with Applications in Data Science . Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 2 edition

work page 2026

[52] [52]

Xiao, J.-J., Ribeiro, A., Luo, Z.-Q., and Giannakis, G. B. (2006). Distributed compression-estimation using wireless sensor networks. IEEE Signal Processing Magazine , 23(4):27--41

work page 2006

[53] [53]

and Raginsky, M

Xu, A. and Raginsky, M. (2017). Information-theoretic lower bounds on Bayes risk in decentralized estimation . IEEE Transactions on Information Theory , 63(3):1580--1600

work page 2017

[54] [54]

and Szab \'o , B

Zaman, A. and Szab \'o , B. (2022). Distributed nonparametric estimation under communication constraints. arXiv preprint arXiv:2204.10373

work page arXiv 2022

[55] [55]

I., and Wainwright, M

Zhang, Y., Duchi, J., Jordan, M. I., and Wainwright, M. J. (2013). Information-theoretic lower bounds for distributed statistical estimation with communication constraints. In Advances in Neural Information Processing Systems 26 , pages 2328--2336

work page 2013

[56] [56]

and Lafferty, J

Zhu, Y. and Lafferty, J. (2018). Distributed nonparametric regression under communication constraints. In International Conference on Machine Learning (ICML) , pages 6009--6017. PMLR

work page 2018