From Coordinate Matching to Structural Alignment: Rethinking Prototype Alignment in Heterogeneous Federated Learning
Pith reviewed 2026-05-08 10:51 UTC · model grok-4.3
The pith
In heterogeneous federated learning, prototype alignment succeeds when it matches inter-class relations rather than exact coordinates.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Coordinate alignment couples two objectives that should be separate: matching inter-class semantic structure, which aids classification, and forcing a shared feature basis, which is harmful under model heterogeneity. Structural alignment removes the second objective by matching relational properties such as inter-class similarities or distances instead of absolute positions, allowing each client's feature extractor to remain distinct while still benefiting from global class relations.
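The distinction can be made concrete with a minimal sketch (illustrative only; the paper's exact FedSAF loss is not reproduced here): a coordinate loss matches prototypes element-wise, while a structural loss matches only the matrix of pairwise inter-class cosine similarities.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def relation_matrix(prototypes):
    """Pairwise inter-class cosine similarities for a set of class prototypes."""
    return [[cosine(p, q) for q in prototypes] for p in prototypes]

def coordinate_loss(client_protos, global_protos):
    """MSE between absolute prototype coordinates (the criticized objective)."""
    return sum(
        (a - b) ** 2
        for c, g in zip(client_protos, global_protos)
        for a, b in zip(c, g)
    ) / sum(len(c) for c in client_protos)

def structural_loss(client_protos, global_protos):
    """MSE between inter-class relation matrices (coordinate-free objective)."""
    rc = relation_matrix(client_protos)
    rg = relation_matrix(global_protos)
    n = len(rc)
    return sum(
        (rc[i][j] - rg[i][j]) ** 2 for i in range(n) for j in range(n)
    ) / (n * n)

# A client whose prototypes are a 90-degree rotation of the global ones:
g = [(1.0, 0.0), (0.0, 1.0)]
c = [(0.0, 1.0), (-1.0, 0.0)]
print(structural_loss(c, g))  # 0.0: inter-class angles are preserved
print(coordinate_loss(c, g))  # 1.0: coordinates penalized despite intact structure
```

The rotated client pays no structural penalty but a full coordinate penalty, which is precisely the coupling the claim says coordinate alignment introduces.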
What carries the argument
Structural alignment objective that matches inter-class relational structure across clients instead of absolute coordinate positions in the embedding space.
Load-bearing premise
Inter-class relational structure can be aligned across clients without any shared coordinate basis, and doing so is always more useful than coordinate matching when feature extractors differ.
What would settle it
A controlled experiment on the same heterogeneous benchmarks in which structural alignment yields accuracy equal to or lower than coordinate alignment.
read the original abstract
Heterogeneous federated learning (HtFL) aims to enable collaboration among clients that differ in both data distributions and model architectures. Prototype-based methods, which communicate class-level feature centers (prototypes) instead of full model parameters, have recently shown strong potential for HtFL. Existing prototype-based HtFL methods typically reuse the MSE-based or cosine-based alignment mechanism developed for homogeneous FL when aligning client-specific representations with global prototypes. These approaches are essentially coordinate alignment, where representations of clients are forced to match the global prototypes in the embedding space in an element-wise manner. Such alignment implicitly assumes that all clients should map their representations into the feature subspace defined by the global prototypes. This assumption is reasonable in homogeneous FL, where all clients share the same feature extractor. However, it becomes problematic in HtFL, since heterogeneous feature extractors naturally induce client-specific feature subspaces, and forcing all clients to optimize within a single global subspace unnecessarily suppresses their learning capacity. We observe that coordinate alignment implicitly couples two distinct objectives: aligning inter-class semantic structure, which is directly beneficial for classification, and enforcing a shared feature basis, which is unnecessary and even harmful under model heterogeneity. Building on this insight, we design FedSAF, which shifts the alignment objective from absolute coordinates to inter-class relational structure. We demonstrate that structural alignment consistently outperforms coordinate alignment in heterogeneous settings. Experiments on multiple benchmarks show that our structural alignment outperforms state-of-the-art prototype-based HtFL methods by up to 3.52%.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper argues that existing prototype-based methods in heterogeneous federated learning (HtFL) rely on coordinate alignment (MSE or cosine similarity) of client prototypes to global ones, which implicitly enforces a shared feature basis unsuitable for heterogeneous model architectures. It proposes FedSAF to instead align inter-class relational structures, claiming this decouples beneficial semantic alignment from harmful coordinate constraints and yields consistent gains, outperforming state-of-the-art prototype-based HtFL methods by up to 3.52% across benchmarks.
Significance. If the performance improvements are shown to arise specifically from the structural alignment objective, the work offers a conceptually useful reframing of prototype alignment in HtFL that could guide future methods toward preserving client-specific feature subspaces while still transferring class relations. The distinction between coordinate and structural objectives is a clear contribution, though its significance hinges on whether experiments isolate this factor from other design elements.
major comments (2)
- [Experiments] Experiments section: the reported gains of up to 3.52% are not supported by an ablation that fixes all other FedSAF components (prototype computation, optimization schedule, regularization) and reverts only the alignment loss to standard MSE/cosine coordinate matching. Without this control, it is impossible to attribute improvements to the claimed shift from coordinate to structural alignment rather than confounding factors.
- [Method] Method section: no explicit loss formulation or derivation is provided for the structural alignment objective (e.g., how inter-class relations are quantified and optimized independently of absolute coordinates), which is load-bearing for the central claim that coordinate alignment couples two distinct objectives.
minor comments (1)
- [Abstract] Abstract: the claim of 'multiple benchmarks' is not accompanied by any enumeration of datasets, model architectures, or heterogeneity settings, which hinders immediate assessment of the scope of the empirical results.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. The comments highlight opportunities to strengthen the experimental isolation of our core contribution and to improve the explicitness of the method. We address each major comment below and will revise the manuscript accordingly.
read point-by-point responses
-
Referee: [Experiments] Experiments section: the reported gains of up to 3.52% are not supported by an ablation that fixes all other FedSAF components (prototype computation, optimization schedule, regularization) and reverts only the alignment loss to standard MSE/cosine coordinate matching. Without this control, it is impossible to attribute improvements to the claimed shift from coordinate to structural alignment rather than confounding factors.
Authors: We agree that the current experiments do not include a controlled ablation that holds every other FedSAF component fixed while swapping only the alignment loss back to coordinate matching. Such an ablation is required to isolate the effect of the structural objective. In the revised manuscript we will add this experiment on the primary benchmarks, reporting accuracy deltas when the structural loss is replaced by MSE and by cosine similarity under identical prototype computation, optimization schedule, and regularization settings. revision: yes
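The requested control can be sketched as a small harness (all names hypothetical): every pipeline component is pinned in one shared config, and runs differ in the alignment-loss entry alone, so any accuracy delta is attributable to the objective.

```python
# Hypothetical sketch of the controlled ablation the referee asks for.
# Component values are illustrative stand-ins, not the paper's settings.

FIXED = {
    "prototype_agg": "class-mean",
    "optimizer": "sgd(lr=0.01)",
    "regularization": "weight_decay=5e-4",
}

def make_run(alignment_loss_name):
    """Return a run config that differs from the baseline only in the loss."""
    cfg = dict(FIXED)
    cfg["alignment_loss"] = alignment_loss_name
    return cfg

runs = [make_run(name) for name in ("mse", "cosine", "structural")]

def diff_keys(a, b):
    """Keys on which two run configs disagree."""
    return [k for k in a if a[k] != b[k]]

# Every pair of runs differs in exactly one key, so an accuracy gap between
# them isolates the coordinate-vs-structural choice from confounders.
print(diff_keys(runs[0], runs[2]))  # ['alignment_loss']
```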
-
Referee: [Method] Method section: no explicit loss formulation or derivation is provided for the structural alignment objective (e.g., how inter-class relations are quantified and optimized independently of absolute coordinates), which is load-bearing for the central claim that coordinate alignment couples two distinct objectives.
Authors: We accept that the manuscript would benefit from a more self-contained mathematical presentation. The structural alignment objective is described in Section 3 as alignment of inter-class relation matrices (pairwise cosine similarities among prototypes), but the explicit loss expression and its derivation are not written out. In the revision we will insert a dedicated subsection containing (i) the precise loss formula, (ii) the derivation showing how the objective depends only on relative angles and is invariant to client-specific linear transformations of the feature space, and (iii) a short argument clarifying the separation from coordinate constraints. revision: yes
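One piece of the promised derivation in (ii) can be checked directly, with a caveat: cosine-based relation matrices are exactly invariant under orthogonal transformations (e.g., rotations) and per-prototype positive rescaling, but not under arbitrary linear maps such as shears, so the revised derivation should scope the invariance claim accordingly. A small sketch with hypothetical 2-D prototypes:

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def relations(protos):
    """Matrix of pairwise inter-class cosine similarities."""
    return [[cosine(p, q) for q in protos] for p in protos]

def linmap(m, protos):
    """Apply a 2x2 linear map m to each 2-D prototype."""
    return [(m[0][0] * x + m[0][1] * y, m[1][0] * x + m[1][1] * y)
            for x, y in protos]

protos = [(1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]

# Rotation by 30 degrees: orthogonal, preserves all pairwise angles.
t = math.radians(30)
rot = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]

# Shear: non-orthogonal linear map, distorts angles.
shear = [[1.0, 0.8], [0.0, 1.0]]

base = relations(protos)
rotated = relations(linmap(rot, protos))
sheared = relations(linmap(shear, protos))
```

Under the rotation, `rotated` equals `base` entry-by-entry (up to float error); under the shear, the off-diagonal entries change, so the structural objective is not blind to all client-specific linear transformations.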
Circularity Check
No circularity: new structural alignment objective introduced independently of prior fitted results
full rationale
The paper proposes FedSAF by shifting the alignment loss from coordinate matching (MSE/cosine on prototypes) to inter-class relational structure. This is presented as a design choice motivated by an observation about heterogeneous feature subspaces, not as a mathematical derivation or re-expression of any fitted quantity. No equations reduce a prediction to an input by construction, no parameters are fitted on a subset and then called a prediction, and no self-citation chain is invoked to justify uniqueness or force the method. The central claim rests on experimental comparisons rather than tautological re-labeling of existing results. The derivation chain is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Inter-class relational structure is directly beneficial for classification and can be aligned independently of client-specific feature bases.
discussion (0)