{"total":13,"items":[{"citing_arxiv_id":"2607.02447","ref_index":51,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Understanding the Robustness of Distributed Self-Supervised Learning Frameworks Against Non-IID Data","primary_cat":"cs.LG","submitted_at":"2026-07-02T17:17:05+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"Abstract-only report: theoretical comparison finds MIM more robust than CL to non-IID data in D-SSL and robustness scales with connectivity; MAR loss proposed as practical application.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.10124","ref_index":46,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"FedSteer: Taming Extreme Gradient Staleness in Federated Learning with Corrective Projections and Caching","primary_cat":"cs.LG","submitted_at":"2026-06-08T19:55:45+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"FedSteer constructs a gradient subspace from cached client updates, projects active gradients to obtain coordinates, and reuses those coordinates on the drifted subspace to correct extreme staleness in federated learning.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.23131","ref_index":33,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"When Determinants Are Not Enough: Private Rare Switching","primary_cat":"cs.LG","submitted_at":"2026-05-22T01:09:14+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Replaces determinant growth with generalized Rayleigh quotient for rare switching in private linear bandits to control worst-direction volume despite non-monotonic design matrices from 
noise.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.18656","ref_index":8,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Statistical Limits and Efficient Algorithms for Differentially Private Federated Learning","primary_cat":"stat.ML","submitted_at":"2026-05-18T17:01:34+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Introduces FedHybrid and FedNewton for DP federated M-estimation, with finite-sample MSE bounds, minimax lower bound, and evaluations on vision datasets.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.15573","ref_index":60,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Response-Conditioned Parallel-to-Sequential Orchestration for Multi-Agent Systems","primary_cat":"cs.CL","submitted_at":"2026-05-15T03:33:20+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Nexa learns a response-conditioned policy that starts with parallel agent execution and adds at most one round of sequential message passing via a predicted sparse DAG, strictly subsuming pure parallel mode.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.10272","ref_index":22,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"DP-LAC: Lightweight Adaptive Clipping for Differentially Private Federated Fine-tuning of Language Models","primary_cat":"cs.LG","submitted_at":"2026-05-11T09:32:21+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"DP-LAC provides a new adaptive clipping technique for DP-SGD in federated LLM fine-tuning that improves accuracy by 6.6% on average without consuming 
additional privacy budget or requiring new hyperparameters.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"designed for central DP-SGD rather than DP-FL. 3. METHODOLOGY 3.1. Problem formulation Fine-tuning a large language model (LLM) typically involves updating model weights W by minimizing a non-convex loss F(W) for T iterations of stochastic gradient descent (SGD). In standard non-convex SGD, the expected average squared gradient norm converges to a finite value (often zero) [22]. Under DP-FL, the clipping bound C is a critical hyperparameter balancing privacy and utility. When DP noise z is injected, a fixed C becomes increasingly detrimental as the average gradient magnitude decays, resulting in the noise term zC to dominate the signal and potentially destabilise the loss landscape [13, 23, 17]. A line of work to ensure stable loss landscapes involves adaptively adjusting C as"},{"citing_arxiv_id":"2604.23426","ref_index":28,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Enhanced Privacy and Communication Efficiency in Non-IID Federated Learning with Adaptive Quantization and Differential Privacy","primary_cat":"cs.CV","submitted_at":"2026-04-25T19:43:05+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Adaptive bit-length schedulers plus Laplacian DP in non-IID FL reduce communicated data by up to 52.64% on MNIST and 45% on CIFAR-10 while keeping competitive accuracy and privacy.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2603.03853","ref_index":4,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Practical Quantum Federated Learning for Privacy-Sensitive Healthcare: Communication Efficiency and Noise 
Resilience","primary_cat":"quant-ph","submitted_at":"2026-03-04T09:04:10+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Hybrid QFL cuts quantum transmissions from 3TNMP to {3t + 2(T-t)}NMP over T rounds while preserving near-centralized convergence and improving depolarizing-noise resilience via decentralized aggregation and Steane-code QEC.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2601.00418","ref_index":21,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Secure, Verifiable, and Scalable Multi-Client Data Sharing via Consensus-Based Privacy-Preserving Data Distribution","primary_cat":"cs.CR","submitted_at":"2026-01-01T18:12:50+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"CPPDD is a new consensus-based protocol for privacy-preserving multi-client data sharing that achieves unanimous-release confidentiality, linear scalability, and high-probability malicious deviation detection.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2411.03926","ref_index":32,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Act in Collusion: Distributed Multi-Target Backdoor Attacks in Federated Learning","primary_cat":"cs.CV","submitted_at":"2024-11-06T13:57:53+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"DMBA maintains attack success rates above 80% for all backdoors in a distributed multi-target FL setting where baselines drop below 50%.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2406.10861","ref_index":105,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Knowledge 
Distillation in Federated Learning: a Survey on Long Lasting Challenges and New Solutions","primary_cat":"cs.LG","submitted_at":"2024-06-16T09:12:16+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":2.0,"formal_verification":"none","one_line_summary":"A survey organizing knowledge distillation techniques for addressing privacy, heterogeneity, communication, and personalization challenges in federated learning.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"matters: Rethinking data heterogeneity in federated learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 8397-8406. [104] Gaurav Menghani and Sujith Ravi. 2019. Learning from a teacher using unlabeled data. arXiv preprint arXiv:1911.05275 (2019). J. ACM, Vol. 37, No. 4, Article 111. Publication date: August 2024. 111:32 L. Qin, et al. [105] Yuxi Mi, Yutong Mu, Shuigeng Zhou, and Jihong Guan. 2021. Fedmdr: Federated model distillation with robust aggregation. In Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint International Conference on Web and Big Data. Springer, 18-32. [106] Jed Mills, Jia Hu, and Geyong Min. 2022. 
Client-Side Optimization Strategies for Communication-Efficient Federated"},{"citing_arxiv_id":"2003.00295","ref_index":20,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Adaptive Federated Optimization","primary_cat":"cs.LG","submitted_at":"2020-02-29T16:37:29+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Proposes federated adaptive optimizers (FedAdagrad, FedAdam, FedYogi) with convergence analysis for non-convex objectives under data heterogeneity and reports empirical gains over FedAvg.","context_count":1,"top_context_role":"background","top_context_polarity":"unclear","context_text":"For FEDADAGRAD, we set β1 = β2 = 0 (as typical versions of ADAGRAD do not use momentum). For FEDADAM and FEDYOGI we set β1 = 0.9, β2 = 0.99. While these parameters are generally good choices (Zaheer et al., 2018), we emphasize that better results may be obtainable by tuning these parameters. Published as a conference paper at ICLR 2021. Algorithm 5 FEDADAGRAD, FEDYOGI, and FEDADAM - Batched data. Input: x_0, v_{−1} ≥ τ^2, optional β1, β2 ∈ [0, 1) for FEDYOGI and FEDADAM. For t = 0, ..., T − 1 do: sample a subset S of clients; x_i^t = x_t; for each client i ∈ S in parallel do: for e = 1, ..., E do: for b ∈ B_i do: x_i^t = x_i^t − η_l ∇f_i(x_i^t; b); Δ_i^t = x_i^t − x_t; n = Σ_{i∈S} n_i, Δ_t = Σ_{i∈S} (n_i / n) Δ_i^t; m_t = β1 m_{t−1} + (1 − β1) Δ_t; v_t = v_{t−1} + Δ_t^2 (FEDADAGRAD); v_t = v_{t−1} − (1 − β2) Δ_t^2 sign(v_{t−1} − Δ_t^2) (FEDYOGI); v_t = β2 v_{t−1} + (1 − β2) Δ_t^2 (FEDADAM); x_{t+1} = x_t + η m_t / (√v_t + τ). B.2 SCAFFOLD As discussed in Section 5, we compare all five optimizers above to SCAFFOLD (Karimireddy et al., 2019) on our various tasks. There are a few important notes about the validity of this comparison. 1. In cross-device settings, this is not a fair comparison. 
In particular, SCAFFOLD does not work in settings where clients cannot maintain state across rounds, as may be the case for federated learning systems on edge devices, such as cell phones. 2. SCAFFOLD has two variants described by Karimireddy et al. (2019). In Option I, the control variate of a client is updated using a full gradient computation. This effectively requires performing an extra pass over each client's dataset, as compared to Algorithm 1. In order to normalize the amount of client work, we instead use Option II, in which the clients' control variates are updated using the difference between the server model and the client's learned model. This requires the same amount of client work as FEDAVG and Algorithm 2. For practical reasons, we implement a version of SCAFFOLD mirro"},{"citing_arxiv_id":"1906.09679","ref_index":27,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"The Value of Collaboration in Convex Machine Learning with Differential Privacy","primary_cat":"cs.CR","submitted_at":"2019-06-24T00:57:15+00:00","verdict":"CONDITIONAL","verdict_confidence":"MODERATE","novelty_score":4.0,"formal_verification":"none","one_line_summary":"The fitness difference between DP and non-private convex ML models is inversely proportional to training dataset size squared and privacy budget squared.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null}],"limit":50,"offset":0}