Deep gradient compression: Reducing the communication bandwidth for distributed training

17 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background: 2 · method: 2

representative citing papers

SignMuon: Communication-Efficient Distributed Muon Optimization

cs.LG · 2026-05-04 · unverdicted · novelty 6.0

SignMuon merges majority-vote sign aggregation from signSGD with Muon's polar-factor steps to create a communication-efficient distributed optimizer that matches signSGD rates under symmetric noise and shows strong empirical results on CIFAR and nanoGPT.

FedSQ: Optimized Weight Averaging via Fixed Gating

cs.LG · 2026-04-03 · unverdicted · novelty 6.0

FedSQ stabilizes federated weight averaging under heterogeneous data by fixing binary gating masks derived from a pretrained model's structure while optimizing only quantitative parameters.

Federated Learning with Non-IID Data

cs.LG · 2018-06-02 · conditional · novelty 6.0

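Non-IID data can reduce federated learning accuracy by up to 55%. The loss is attributed to weight divergence, which is quantified by the earth mover's distance between each client's class distribution and the population distribution. Sharing a small global subset (5% of the data) across clients raises accuracy by about 30% on CIFAR-10.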
