Scaling Federated Linear Contextual Bandits via Sketching
Pith reviewed 2026-05-09 20:14 UTC · model grok-4.3
The pith
Federated sketching reduces high-dimensional bandit costs while matching optimal regret.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
FSCLB uses SVD to compute the required determinant indirectly via the sketch, achieving O(l^2 d) time instead of O(d^3), and employs a double-sketch strategy to reduce uploads and downloads to O(l d). By substituting the sketch for the covariance matrix in the communication decision, it preserves the local-increment property and the asynchronous condition. The resulting regret is Õ((√d + √(M ε_l)) √(l T)), which matches the optimal Õ(√(l d T)) when l is large enough to capture the full rank of the covariance.
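The determinant shortcut in this claim can be illustrated numerically. The sketch below is a hedged example, not the paper's implementation: the variable names are ours, and for simplicity we build an l x d sketch S from the top singular directions, whereas the paper presumably maintains one online. It compares the O(d^3) direct log-determinant with the O(l^2 d) sketch-based route.

```python
import numpy as np

rng = np.random.default_rng(0)
d, l, lam = 200, 20, 1.0

# Low-rank data: the covariance A^T A has rank 10 < l.
A = rng.normal(size=(2000, 10)) @ rng.normal(size=(10, d)) / np.sqrt(2000)

# Illustrative sketch S (l x d) with S^T S ~= A^T A, built from top directions.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
S = np.diag(s[:l]) @ Vt[:l]

# Direct route: log det(lam*I_d + A^T A), an O(d^3) computation.
logdet_direct = np.linalg.slogdet(lam * np.eye(d) + A.T @ A)[1]

# Sketch route: an SVD of the l x d sketch costs O(l^2 d), and
# det(lam*I_d + S^T S) = lam^(d-l) * prod_i (lam + sigma_i(S)^2).
sig = np.linalg.svd(S, compute_uv=False)
logdet_sketch = (d - l) * np.log(lam) + np.sum(np.log(lam + sig**2))

print(abs(logdet_direct - logdet_sketch))  # near zero: the sketch captures the rank
```

When l is at least the rank of the covariance, the two log-determinants coincide up to floating-point error, which is exactly the regime in which the pith says the regret bound collapses to the no-sketch rate.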
What carries the argument
Double-sketch strategy with SVD-based determinant and sketch substitution for communication decisions in FSCLB.
Load-bearing premise
Substituting the sketch matrix for the covariance matrix when deciding communication preserves both the asynchronous condition and the validity of the regret analysis without introducing uncontrolled bias.
What would settle it
A test where l is set below the known rank of the covariance on full-rank data, checking if the observed regret exceeds the claimed bound or if the asynchronous property breaks.
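One ingredient of such a test can be sketched directly: measure how the sketch error behaves when l is below versus above the rank. The following is a minimal Frequent Directions implementation in the spirit of Ghashami et al.; the paper's actual sketch, parameters, and data model may differ, and it assumes 2*l <= d.

```python
import numpy as np

def frequent_directions(X, l):
    """Minimal Frequent Directions sketch (after Ghashami et al.).
    Returns B (l x d) with B^T B approximating X^T X; assumes 2*l <= d."""
    n, d = X.shape
    B = np.zeros((2 * l, d))
    nxt = 0
    for x in X:
        if nxt == 2 * l:  # buffer full: shrink via SVD
            _, s, Vt = np.linalg.svd(B, full_matrices=False)
            s2 = np.maximum(s**2 - s[l - 1] ** 2, 0.0)
            B = np.sqrt(s2)[:, None] * Vt
            nxt = l
        B[nxt] = x
        nxt += 1
    _, s, Vt = np.linalg.svd(B, full_matrices=False)  # final shrink
    s2 = np.maximum(s**2 - s[l - 1] ** 2, 0.0)
    return (np.sqrt(s2)[:, None] * Vt)[:l]

rng = np.random.default_rng(1)
d, rank, T = 100, 30, 2000
X = rng.normal(size=(T, rank)) @ rng.normal(size=(rank, d)) / np.sqrt(d)

errs = {}
for l in (10, 40):  # l below vs above the true rank
    B = frequent_directions(X, l)
    errs[l] = np.linalg.norm(X.T @ X - B.T @ B, 2)
print(errs)  # large sketch error for l=10, near zero for l=40
```

If the observed regret of the full algorithm tracks this sketch error (large when l is below the rank, negligible above it), the spectral-tail interpretation of ε_l is supported; a regret blow-up beyond the claimed bound, or missed communication rounds, would falsify the load-bearing premise.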
Original abstract
In federated contextual linear bandits, high data dimensionality incurs prohibitive computation and communication costs: local agents perform $O(d^3)$-time determinant computation and upload $O(d^2)$ parameters, making existing algorithms unscalable, where $d$ is the dimension of the data. To relieve these scaling bottlenecks, this paper proposes Federated Sketch Contextual Linear Bandits (FSCLB). On the computation side, FSCLB uses SVD to indirectly obtain the determinant required for communication, eliminating the prohibitive cost of direct determinant calculation and cutting complexity from $O(d^3)$ to $O(l^2d)$ per round, where $l < d$ is the sketch size. On the communication side, FSCLB introduces a double-sketch strategy that reduces both upload and download costs from $O(d^2)$ to $O(ld)$. Naively incorporating sketch updates into federated contextual linear bandits can destroy the local increment and invalidate the asynchronous communication condition; FSCLB solves this by replacing the covariance matrix with the sketch matrix when deciding whether to communicate. Theoretically, FSCLB achieves a regret bound of $\widetilde{O}((\sqrt{d}+\sqrt{M\varepsilon_l})\sqrt{lT})$, where $\varepsilon_l$ is upper bounded by the spectral tail of the covariance matrix; when $l$ exceeds the rank of the covariance matrix, the bound simplifies to $\widetilde{O}(\sqrt{ldT})$, matching the optimal no-sketch regret. Experiments on both synthetic and real-world datasets show that FSCLB significantly reduces computational and communication costs by over 90% while sacrificing only a negligible amount of cumulative reward.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes Federated Sketch Contextual Linear Bandits (FSCLB) to scale federated linear contextual bandits to high dimensions. It uses SVD-based sketching to reduce per-round determinant computation from O(d^3) to O(l^2 d) and a double-sketch strategy to cut communication from O(d^2) to O(l d). To preserve the asynchronous communication condition that naive sketching would destroy, FSCLB substitutes the sketch matrix for the covariance matrix when deciding whether to communicate. The central theoretical claim is a regret bound of O~((√d + √(M ε_l)) √(l T)), where ε_l bounds the spectral tail of the covariance; when l exceeds the matrix rank the bound simplifies to O~(√(l d T)), matching the optimal no-sketch regret. Experiments on synthetic and real data report >90% reductions in compute and communication cost with negligible reward loss.
Significance. If the regret bound is rigorously shown to hold after the sketch substitution and the empirical gains are reproducible with proper baselines and statistics, the work would be significant for practical high-dimensional federated bandits. The double-sketch approach and the explicit fix for the asynchronous condition are concrete contributions; the parameter-free simplification of the bound when l is large enough would be a notable strength if the derivation is complete.
major comments (3)
- [§4, Theorem 1] Main regret theorem: the bound O~((√d + √(M ε_l)) √(l T)) is derived from standard linear-bandit analysis plus an additive sketch-error term, yet the manuscript provides no explicit steps showing that substituting the sketch matrix for the covariance matrix in the communication decision preserves the exact timing and validity of local increments required by the asynchronous analysis. Without this verification, the claim that the bound reduces to the optimal no-sketch rate when l exceeds rank cannot be confirmed.
- [§3.2] Algorithm description: the text states that naive sketching 'destroys the local increment and invalidates the asynchronous communication condition,' then asserts the replacement solves it, but does not quantify any additional bias or change in communication frequency introduced by the modified decision rule. This substitution is load-bearing for the subsequent regret analysis.
- [Experiments] Experimental section: the reported >90% cost reductions and 'negligible' reward sacrifice are presented without baseline comparisons to the original (non-sketch) federated algorithm, without error bars or statistical tests on cumulative reward, and without details on the number of independent runs, leaving the empirical support for the central scaling claim incomplete.
minor comments (2)
- [Abstract and §4] Notation for M (number of agents) and ε_l (spectral tail) is introduced in the abstract and theorem but not defined with a precise equation reference in the main text.
- [Figures 1-3] Figure captions and table headers should explicitly state the sketch size l used in each plotted curve to allow direct comparison with the theoretical regime l > rank.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed comments. We address each major point below and will revise the manuscript to incorporate clarifications and additional material where appropriate.
Point-by-point responses
-
Referee: [§4, Theorem 1] Main regret theorem: the bound O~((√d + √(M ε_l)) √(l T)) is derived from standard linear-bandit analysis plus an additive sketch-error term, yet the manuscript provides no explicit steps showing that substituting the sketch matrix for the covariance matrix in the communication decision preserves the exact timing and validity of local increments required by the asynchronous analysis. Without this verification, the claim that the bound reduces to the optimal no-sketch rate when l exceeds rank cannot be confirmed.
Authors: We agree that the proof presentation would benefit from greater explicitness. The substitution replaces the covariance matrix with its sketch only in the communication trigger while retaining the original local update rule for the covariance itself; because the sketch error is controlled by the spectral tail ε_l (already present in the regret statement), the timing of communication rounds remains within the same asymptotic regime as the unsketched analysis. In the revision we will insert a short supporting lemma (placed in the appendix and referenced from §4) that formally verifies preservation of the local-increment and asynchrony conditions under this substitution, thereby confirming that the bound collapses to the optimal Õ(√(l d T)) rate once l exceeds the covariance rank. revision: yes
-
Referee: [§3.2] Algorithm description: the text states that naive sketching 'destroys the local increment and invalidates the asynchronous communication condition,' then asserts the replacement solves it, but does not quantify any additional bias or change in communication frequency introduced by the modified decision rule. This substitution is load-bearing for the subsequent regret analysis.
Authors: The sketch-based trigger introduces an additive perturbation bounded by ε_l in the determinant test; this perturbation affects only the decision threshold and does not alter the underlying local covariance increments. We will expand §3.2 with a short paragraph that bounds the resulting difference in communication frequency (at most O(√(T) ε_l) extra rounds in expectation) and shows that any extra bias is absorbed into the existing additive sketch-error term of the regret bound. revision: yes
-
Referee: [Experiments] Experimental section: the reported >90% cost reductions and 'negligible' reward sacrifice are presented without baseline comparisons to the original (non-sketch) federated algorithm, without error bars or statistical tests on cumulative reward, and without details on the number of independent runs, leaving the empirical support for the central scaling claim incomplete.
Authors: We accept that the experimental section requires strengthening. In the revision we will (i) add direct runtime and communication comparisons against the original non-sketch federated linear contextual bandit algorithm, (ii) report means and standard deviations over at least five independent random seeds with error bars on all cumulative-reward plots, and (iii) include paired t-tests (or Wilcoxon tests) to quantify statistical significance of the observed cost savings and reward differences. revision: yes
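The substitution the responses describe, using the sketch only inside the determinant-based communication trigger, can be illustrated with a minimal sketch. The function names, the threshold form gamma, and the zero-initialized synced sketch are our assumptions for illustration, not the paper's specification.

```python
import numpy as np

def logdet_from_sketch(S, lam, d):
    # log det(lam*I_d + S^T S) via the singular values of the l x d sketch.
    sig = np.linalg.svd(S, compute_uv=False)
    return (d - S.shape[0]) * np.log(lam) + np.sum(np.log(lam + sig**2))

def should_communicate(S_local, S_synced, lam, d, gamma):
    """Sketch-based determinant trigger: fire when the local sketch's
    log-volume has grown by more than gamma since the last sync."""
    growth = logdet_from_sketch(S_local, lam, d) - logdet_from_sketch(S_synced, lam, d)
    return growth > gamma

rng = np.random.default_rng(2)
d, l, lam = 50, 8, 1.0
S_synced = np.zeros((l, d))                # sketch state at the last sync
S_local = rng.normal(size=(l, d))          # sketch after absorbing new local contexts

print(should_communicate(S_local, S_synced, lam, d, gamma=1.0))  # True
```

Because only l x d sketches enter the test, the trigger itself costs O(l^2 d) per check, and any bias relative to the covariance-based trigger is confined to the perturbation of the log-determinant, consistent with the ε_l accounting in the responses above.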
Circularity Check
Regret bound derived from standard linear-bandit analysis plus explicit sketch-error term
Full rationale
The paper states a regret bound of O~((√d + √(M ε_l)) √(l T)) that reduces to O~(√(l d T)) when l exceeds rank because ε_l vanishes. This follows from incorporating the spectral-tail approximation error of the sketch into the usual analysis of linear contextual bandits under asynchronous communication; the bound does not reduce to any fitted parameter, self-referential definition, or self-citation chain. The double-sketch strategy and covariance-to-sketch substitution are presented as algorithmic fixes whose error is explicitly bounded by ε_l rather than assumed away. No load-bearing step equates the claimed result to its inputs by construction.
Axiom & Free-Parameter Ledger
free parameters (1)
- sketch size l
axioms (2)
- domain assumption: The covariance matrix admits a low-rank or rapidly decaying spectrum, so that a sketch of size l yields a small spectral tail ε_l.
- standard math: Standard linear contextual bandit assumptions (bounded contexts, sub-Gaussian noise, etc.) continue to hold after sketching.
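The first assumption can be made concrete: for a given covariance, the tail quantity playing the role of ε_l is directly computable. This is a hedged illustration; the paper's exact definition of ε_l (e.g. its normalization) may differ.

```python
import numpy as np

def spectral_tail(cov, l):
    # eps_l-style tail: sum of the covariance eigenvalues beyond the top l.
    eigs = np.sort(np.linalg.eigvalsh(cov))[::-1]
    return float(np.sum(eigs[l:]))

rng = np.random.default_rng(3)
d, rank = 60, 12
G = rng.normal(size=(d, rank))
cov = G @ G.T  # rank-12 covariance: the tail vanishes once l >= 12

print(spectral_tail(cov, 5))   # positive: l below the rank leaves a tail
print(spectral_tail(cov, 12))  # ~0: l at the rank, the tail vanishes
```

Under this reading, the additive √(M ε_l) term in the regret bound is zero exactly when l reaches the covariance rank, which is how the bound collapses to Õ(√(l d T)).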