Improving generalization in federated learning with highly heterogeneous data via momentum-based stochastic controlled weight averaging

Liu, J · 2025 · arXiv 2510.27403

9 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Why Muon Outperforms Adam: A Curvature Perspective

cs.LG · 2026-06-03 · conditional · novelty 7.0

Muon outperforms Adam by reducing curvature penalty via lower Normalized Directional Sharpness, as shown via Taylor approximation on LLM training and proven on stylized quadratic problems with heterogeneous curvature.

FedBCD: Communication-Efficient Accelerated Block Coordinate Gradient Descent for Federated Learning

cs.LG · 2026-03-05 · unverdicted · novelty 7.0

FedBCGD reduces communication in federated learning by a factor of 1/N through block-wise parameter updates with accelerated convergence guarantees.

DP-FedAdamW: An Efficient Optimizer for Differentially Private Federated Large Models

cs.LG · 2026-02-23 · unverdicted · novelty 7.0

DP-FedAdamW delivers an unbiased second-moment estimator for AdamW in DPFL, proving linear convergence acceleration without heterogeneity assumptions and outperforming SOTA by 5.83% on Tiny-ImageNet with Swin-Base at ε=1.

SUDA-Muon: Structural Design Principles and Boundaries for Fully Decentralized Muon

math.OC · 2026-04-27 · unverdicted · novelty 6.0

SUDA-Muon modularizes decentralized Muon via the SUDA template, proving a topology-separated convergence rate of O((1+σ/√N)K^{-1/4}) in nuclear-norm geometry while establishing that tracking-before-polarization is required to avoid non-stationary fixed points and that local-polarize-then-average is

Rethinking LoRA for Privacy-Preserving Federated Learning in Large Models

cs.LG · 2026-02-23 · unverdicted · novelty 6.0

LA-LoRA decouples LoRA matrix updates in DPFL settings to improve robustness to privacy noise, delivering up to 16.83% higher accuracy than prior LoRA variants on Swin-B under a strict privacy budget of ε=1.

Subspace Optimization for Efficient Federated Learning under Heterogeneous Data

cs.LG · 2026-04-28 · unverdicted · novelty 5.0

SSF enables efficient federated learning under heterogeneous data by optimizing in a low-dimensional subspace with projected corrections and backfill updates, achieving a non-asymptotic convergence rate of order Õ(1/T + 1/√(NKT)).

A Note on Stability for Orthogonalized Matrix Momentum with Client Sampling

cs.LG · 2026-06-01 · unverdicted · novelty 4.0

Derives a finite-round upper-tail guarantee on the population-empirical gap for client-sampled orthogonalized matrix momentum under heterogeneous data, assuming a Lipschitz condition on the orthogonalizer.

FedSLoP: Memory-Efficient Federated Learning with Low-Rank Gradient Projection

cs.LG · 2026-04-27 · unverdicted · novelty 4.0 · 2 refs

FedSLoP applies stochastic low-rank gradient projections in federated learning to reduce communication volume and client memory while proving O(1/√(NT)) convergence to stationary points under standard assumptions and showing competitive accuracy on heterogeneous MNIST.

FedNSAM: Consistency of Local and Global Flatness for Federated Learning

cs.LG · 2026-02-27 · unverdicted · novelty 4.0

FedNSAM uses global Nesterov momentum to make local flatness consistent with global flatness in federated learning, yielding tighter convergence than FedSAM and better empirical performance.
