Scaling Federated Linear Contextual Bandits via Sketching
Pith reviewed 2026-05-09 20:14 UTC · model grok-4.3
The pith
Federated sketching reduces high-dimensional bandit costs while matching optimal regret.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
FSCLB uses SVD to compute the required determinant indirectly via the sketch, achieving O(l^2 d) time instead of O(d^3), and employs a double-sketch strategy to reduce uploads and downloads to O(l d). By substituting the sketch for the covariance matrix in the communication decision, it preserves the local-increment property and the asynchronous condition. The resulting regret is Õ((√d + √(M ε_l)) √(l T)), which matches the optimal Õ(√(l d T)) when l is large enough to capture the full rank of the covariance.
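The determinant shortcut in this claim can be illustrated numerically. The sketch below is a hedged example, not the paper's implementation: the variable names are ours, and for simplicity we build an l x d sketch S from the top singular directions, whereas the paper presumably maintains one online. It compares the O(d^3) direct log-determinant with the O(l^2 d) sketch-based route.

```python
import numpy as np

rng = np.random.default_rng(0)
d, l, lam = 200, 20, 1.0

# Low-rank data: the covariance A^T A has rank 10 < l.
A = rng.normal(size=(2000, 10)) @ rng.normal(size=(10, d)) / np.sqrt(2000)

# Illustrative sketch S (l x d) with S^T S ~= A^T A, built from top directions.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
S = np.diag(s[:l]) @ Vt[:l]

# Direct route: log det(lam*I_d + A^T A), an O(d^3) computation.
logdet_direct = np.linalg.slogdet(lam * np.eye(d) + A.T @ A)[1]

# Sketch route: an SVD of the l x d sketch costs O(l^2 d), and
# det(lam*I_d + S^T S) = lam^(d-l) * prod_i (lam + sigma_i(S)^2).
sig = np.linalg.svd(S, compute_uv=False)
logdet_sketch = (d - l) * np.log(lam) + np.sum(np.log(lam + sig**2))

print(abs(logdet_direct - logdet_sketch))  # near zero: the sketch captures the rank
```

When l is at least the rank of the covariance, the two log-determinants coincide up to floating-point error, which is exactly the regime in which the pith says the regret bound collapses to the no-sketch rate.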
What carries the argument
Double-sketch strategy with SVD-based determinant and sketch substitution for communication decisions in FSCLB.
Load-bearing premise
Substituting the sketch matrix for the covariance matrix when deciding communication preserves both the asynchronous condition and the validity of the regret analysis without introducing uncontrolled bias.
What would settle it
A test where l is set below the known rank of the covariance on full-rank data, checking if the observed regret exceeds the claimed bound or if the asynchronous property breaks.
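One ingredient of such a test can be sketched directly: measure how the sketch error behaves when l is below versus above the rank. The following is a minimal Frequent Directions implementation in the spirit of Ghashami et al.; the paper's actual sketch, parameters, and data model may differ, and it assumes 2*l <= d.

```python
import numpy as np

def frequent_directions(X, l):
    """Minimal Frequent Directions sketch (after Ghashami et al.).
    Returns B (l x d) with B^T B approximating X^T X; assumes 2*l <= d."""
    n, d = X.shape
    B = np.zeros((2 * l, d))
    nxt = 0
    for x in X:
        if nxt == 2 * l:  # buffer full: shrink via SVD
            _, s, Vt = np.linalg.svd(B, full_matrices=False)
            s2 = np.maximum(s**2 - s[l - 1] ** 2, 0.0)
            B = np.sqrt(s2)[:, None] * Vt
            nxt = l
        B[nxt] = x
        nxt += 1
    _, s, Vt = np.linalg.svd(B, full_matrices=False)  # final shrink
    s2 = np.maximum(s**2 - s[l - 1] ** 2, 0.0)
    return (np.sqrt(s2)[:, None] * Vt)[:l]

rng = np.random.default_rng(1)
d, rank, T = 100, 30, 2000
X = rng.normal(size=(T, rank)) @ rng.normal(size=(rank, d)) / np.sqrt(d)

errs = {}
for l in (10, 40):  # l below vs above the true rank
    B = frequent_directions(X, l)
    errs[l] = np.linalg.norm(X.T @ X - B.T @ B, 2)
print(errs)  # large sketch error for l=10, near zero for l=40
```

If the observed regret of the full algorithm tracks this sketch error (large when l is below the rank, negligible above it), the spectral-tail interpretation of ε_l is supported; a regret blow-up beyond the claimed bound, or missed communication rounds, would falsify the load-bearing premise.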
Original abstract
In federated contextual linear bandits, high data dimensionality incurs prohibitive computation and communication costs: local agents perform $O(d^3)$-time determinant computation and upload $O(d^2)$ parameters, making existing algorithms unscalable, where $d$ is the dimension of the data. To relieve these scaling bottlenecks, this paper proposes Federated Sketch Contextual Linear Bandits (FSCLB). On the computation side, FSCLB uses SVD to indirectly obtain the determinant required for communication, eliminating the prohibitive cost of direct determinant calculation and cutting complexity from $O(d^3)$ to $O(l^2d)$ per round, where $l < d$ is the sketch size. On the communication side, FSCLB introduces a double-sketch strategy that reduces both upload and download costs from $O(d^2)$ to $O(ld)$. Naively incorporating sketch updates into federated contextual linear bandits can destroy the local increment and invalidate the asynchronous communication condition; FSCLB solves this by replacing the covariance matrix with the sketch matrix when deciding whether to communicate. Theoretically, FSCLB achieves a regret bound of $\widetilde{O}((\sqrt{d}+\sqrt{M\varepsilon_l})\sqrt{lT})$, where $\varepsilon_l$ is upper bounded by the spectral tail of the covariance matrix; when $l$ exceeds the rank of the covariance matrix, the bound simplifies to $\widetilde{O}(\sqrt{ldT})$, matching the optimal no-sketch regret. Experiments on both synthetic and real-world datasets show that FSCLB significantly reduces computational and communication costs by over 90% while sacrificing only a negligible amount of cumulative reward.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes Federated Sketch Contextual Linear Bandits (FSCLB) to scale federated linear contextual bandits to high dimensions. It uses SVD-based sketching to reduce per-round determinant computation from O(d^3) to O(l^2 d) and a double-sketch strategy to cut communication from O(d^2) to O(l d). To preserve the asynchronous communication condition that naive sketching would destroy, FSCLB substitutes the sketch matrix for the covariance matrix when deciding whether to communicate. The central theoretical claim is a regret bound of O~((√d + √(M ε_l)) √(l T)), where ε_l bounds the spectral tail of the covariance; when l exceeds the matrix rank the bound simplifies to O~(√(l d T)), matching the optimal no-sketch regret. Experiments on synthetic and real data report >90% reductions in compute and communication cost with negligible reward loss.
Significance. If the regret bound is rigorously shown to hold after the sketch substitution and the empirical gains are reproducible with proper baselines and statistics, the work would be significant for practical high-dimensional federated bandits. The double-sketch approach and the explicit fix for the asynchronous condition are concrete contributions; the parameter-free simplification of the bound when l is large enough would be a notable strength if the derivation is complete.
major comments (3)
- [§4, Theorem 1] Main regret theorem: the bound O~((√d + √(M ε_l)) √(l T)) is derived from standard linear-bandit analysis plus an additive sketch-error term, yet the manuscript provides no explicit steps showing that substituting the sketch matrix for the covariance matrix in the communication decision preserves the exact timing and validity of local increments required by the asynchronous analysis. Without this verification, the claim that the bound reduces to the optimal no-sketch rate when l exceeds rank cannot be confirmed.
- [§3.2] Algorithm description: the text states that naive sketching 'destroys the local increment and invalidates the asynchronous communication condition,' then asserts the replacement solves it, but does not quantify any additional bias or change in communication frequency introduced by the modified decision rule. This substitution is load-bearing for the subsequent regret analysis.
- [Experiments] Experimental section: the reported >90% cost reductions and 'negligible' reward sacrifice are presented without baseline comparisons to the original (non-sketch) federated algorithm, without error bars or statistical tests on cumulative reward, and without details on the number of independent runs, leaving the empirical support for the central scaling claim incomplete.
minor comments (2)
- [Abstract and §4] Notation for M (number of agents) and ε_l (spectral tail) is introduced in the abstract and theorem but not defined with a precise equation reference in the main text.
- [Figures 1-3] Figure captions and table headers should explicitly state the sketch size l used in each plotted curve to allow direct comparison with the theoretical regime l > rank.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed comments. We address each major point below and will revise the manuscript to incorporate clarifications and additional material where appropriate.
Point-by-point responses
-
Referee: [§4, Theorem 1] Main regret theorem: the bound O~((√d + √(M ε_l)) √(l T)) is derived from standard linear-bandit analysis plus an additive sketch-error term, yet the manuscript provides no explicit steps showing that substituting the sketch matrix for the covariance matrix in the communication decision preserves the exact timing and validity of local increments required by the asynchronous analysis. Without this verification, the claim that the bound reduces to the optimal no-sketch rate when l exceeds rank cannot be confirmed.
Authors: We agree that the proof presentation would benefit from greater explicitness. The substitution replaces the covariance matrix with its sketch only in the communication trigger while retaining the original local update rule for the covariance itself; because the sketch error is controlled by the spectral tail ε_l (already present in the regret statement), the timing of communication rounds remains within the same asymptotic regime as the unsketched analysis. In the revision we will insert a short supporting lemma (placed in the appendix and referenced from §4) that formally verifies preservation of the local-increment and asynchrony conditions under this substitution, thereby confirming that the bound collapses to the optimal Õ(√(l d T)) rate once l exceeds the covariance rank. revision: yes
-
Referee: [§3.2] Algorithm description: the text states that naive sketching 'destroys the local increment and invalidates the asynchronous communication condition,' then asserts the replacement solves it, but does not quantify any additional bias or change in communication frequency introduced by the modified decision rule. This substitution is load-bearing for the subsequent regret analysis.
Authors: The sketch-based trigger introduces an additive perturbation bounded by ε_l in the determinant test; this perturbation affects only the decision threshold and does not alter the underlying local covariance increments. We will expand §3.2 with a short paragraph that bounds the resulting difference in communication frequency (at most O(√(T) ε_l) extra rounds in expectation) and shows that any extra bias is absorbed into the existing additive sketch-error term of the regret bound. revision: yes
-
Referee: [Experiments] Experimental section: the reported >90% cost reductions and 'negligible' reward sacrifice are presented without baseline comparisons to the original (non-sketch) federated algorithm, without error bars or statistical tests on cumulative reward, and without details on the number of independent runs, leaving the empirical support for the central scaling claim incomplete.
Authors: We accept that the experimental section requires strengthening. In the revision we will (i) add direct runtime and communication comparisons against the original non-sketch federated linear contextual bandit algorithm, (ii) report means and standard deviations over at least five independent random seeds with error bars on all cumulative-reward plots, and (iii) include paired t-tests (or Wilcoxon tests) to quantify statistical significance of the observed cost savings and reward differences. revision: yes
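The substitution the responses describe, using the sketch only inside the determinant-based communication trigger, can be illustrated with a minimal sketch. The function names, the threshold form gamma, and the zero-initialized synced sketch are our assumptions for illustration, not the paper's specification.

```python
import numpy as np

def logdet_from_sketch(S, lam, d):
    # log det(lam*I_d + S^T S) via the singular values of the l x d sketch.
    sig = np.linalg.svd(S, compute_uv=False)
    return (d - S.shape[0]) * np.log(lam) + np.sum(np.log(lam + sig**2))

def should_communicate(S_local, S_synced, lam, d, gamma):
    """Sketch-based determinant trigger: fire when the local sketch's
    log-volume has grown by more than gamma since the last sync."""
    growth = logdet_from_sketch(S_local, lam, d) - logdet_from_sketch(S_synced, lam, d)
    return growth > gamma

rng = np.random.default_rng(2)
d, l, lam = 50, 8, 1.0
S_synced = np.zeros((l, d))                # sketch state at the last sync
S_local = rng.normal(size=(l, d))          # sketch after absorbing new local contexts

print(should_communicate(S_local, S_synced, lam, d, gamma=1.0))  # True
```

Because only l x d sketches enter the test, the trigger itself costs O(l^2 d) per check, and any bias relative to the covariance-based trigger is confined to the perturbation of the log-determinant, consistent with the ε_l accounting in the responses above.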
Circularity Check
Regret bound derived from standard linear-bandit analysis plus explicit sketch-error term
Full rationale
The paper states a regret bound of O~((√d + √(M ε_l)) √(l T)) that reduces to O~(√(l d T)) when l exceeds rank because ε_l vanishes. This follows from incorporating the spectral-tail approximation error of the sketch into the usual analysis of linear contextual bandits under asynchronous communication; the bound does not reduce to any fitted parameter, self-referential definition, or self-citation chain. The double-sketch strategy and covariance-to-sketch substitution are presented as algorithmic fixes whose error is explicitly bounded by ε_l rather than assumed away. No load-bearing step equates the claimed result to its inputs by construction.
Axiom & Free-Parameter Ledger
free parameters (1)
- sketch size l
axioms (2)
- domain assumption: The covariance matrix admits a low-rank or rapidly decaying spectrum, so that a sketch of size l yields a small spectral tail ε_l.
- standard math: Standard linear contextual bandit assumptions (bounded contexts, sub-Gaussian noise, etc.) continue to hold after sketching.
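The first assumption can be made concrete: for a given covariance, the tail quantity playing the role of ε_l is directly computable. This is a hedged illustration; the paper's exact definition of ε_l (e.g. its normalization) may differ.

```python
import numpy as np

def spectral_tail(cov, l):
    # eps_l-style tail: sum of the covariance eigenvalues beyond the top l.
    eigs = np.sort(np.linalg.eigvalsh(cov))[::-1]
    return float(np.sum(eigs[l:]))

rng = np.random.default_rng(3)
d, rank = 60, 12
G = rng.normal(size=(d, rank))
cov = G @ G.T  # rank-12 covariance: the tail vanishes once l >= 12

print(spectral_tail(cov, 5))   # positive: l below the rank leaves a tail
print(spectral_tail(cov, 12))  # ~0: l at the rank, the tail vanishes
```

Under this reading, the additive √(M ε_l) term in the regret bound is zero exactly when l reaches the covariance rank, which is how the bound collapses to Õ(√(l d T)).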