Improving generalization in federated learning with highly heterogeneous data via momentum-based stochastic controlled weight averaging

9 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Why Muon Outperforms Adam: A Curvature Perspective

cs.LG · 2026-06-03 · conditional · novelty 7.0

Muon outperforms Adam by reducing curvature penalty via lower Normalized Directional Sharpness, as shown via Taylor approximation on LLM training and proven on stylized quadratic problems with heterogeneous curvature.

SUDA-Muon: Structural Design Principles and Boundaries for Fully Decentralized Muon

math.OC · 2026-04-27 · unverdicted · novelty 6.0

SUDA-Muon modularizes decentralized Muon via the SUDA template, proving a topology-separated convergence rate of O((1+σ/√N)K^{-1/4}) in nuclear-norm geometry while establishing that tracking-before-polarization is required to avoid non-stationary fixed points and that local-polarize-then-average is

FedSLoP: Memory-Efficient Federated Learning with Low-Rank Gradient Projection

cs.LG · 2026-04-27 · unverdicted · novelty 4.0 · 2 refs

FedSLoP applies stochastic low-rank gradient projections in federated learning to reduce communication volume and client memory while proving O(1/√(NT)) convergence to stationary points under standard assumptions and showing competitive accuracy on heterogeneous MNIST.
