FedFrozen improves stability in heterogeneous federated Transformer training by warming up the full model then freezing the attention kernel (query/key) while optimizing the value block under a fixed kernel.
11 Appendix In the appendix, we provide the mathematical proofs for all theorems presented in the paper, fol- lowed by additional experimental results
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.LG 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
FedFrozen: Two-Stage Federated Optimization via Attention Kernel Freezing
FedFrozen improves stability in heterogeneous federated Transformer training by warming up the full model then freezing the attention kernel (query/key) while optimizing the value block under a fixed kernel.