arxiv: 2406.10861 · v1 · submitted 2024-06-16 · 💻 cs.LG · cs.DC

Knowledge Distillation in Federated Learning: a Survey on Long Lasting Challenges and New Solutions

Pith reviewed 2026-05-23 23:46 UTC · model grok-4.3

classification 💻 cs.LG cs.DC
keywords federated learning · knowledge distillation · privacy preservation · data heterogeneity · communication efficiency · personalization · model compression

The pith

Knowledge distillation transfers logits to address privacy, heterogeneity, communication, and personalization challenges in federated learning.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This survey reviews applications of knowledge distillation in federated learning since 2020. It organizes methods according to how they use logit exchange to mitigate the core problems of traditional FL without moving raw data. The authors present KD as a compression and transfer tool that fits naturally with distributed training constraints. They cover motivation, taxonomy, critical factors, and remaining open issues in the combined approach.

Core claim

The paper establishes that exchanging logits at intermediate or output layers lets KD serve as an effective mechanism for knowledge transfer in FL, directly supporting privacy protection, handling of non-IID data, reduced communication volume, and client personalization.
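
As a rough illustration of what "knowledge transfer by exchanging logits" can mean in practice, the sketch below computes a distillation loss between a teacher and a student. It assumes the standard temperature-softened formulation of Hinton et al. (2015), which this page does not spell out; the function names, the temperature value, and the example logits are all hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2 (assumed convention)."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

# Hypothetical logits for a 3-class problem.
teacher = [4.0, 1.0, 0.5]
student_far = [0.5, 1.0, 4.0]    # disagrees with the teacher
student_near = [3.8, 1.1, 0.6]   # close to the teacher

print(kd_loss(teacher, teacher))  # identical logits give zero loss
print(kd_loss(teacher, student_far) > kd_loss(teacher, student_near))
```

Only the logits cross the teacher-student boundary here, which is the property the survey relies on: neither raw data nor model weights need to be shared for the student to receive a training signal.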

What carries the argument

A taxonomy of KD-based FL methods organized by distillation location, teacher-student roles, and the specific FL challenge each method targets.

Load-bearing premise

The body of KD-FL papers published since 2020 is representative enough to show that logit exchange reliably solves the listed FL problems.

What would settle it

A set of experiments on standard FL benchmarks where KD variants show no measurable gains in privacy metrics, accuracy under data skew, or bits transmitted compared with non-KD baselines.
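
For the "bits transmitted" part of such a comparison, a back-of-the-envelope calculation already shows what would need to be measured. The sketch below assumes a logit-sharing scheme in which each client uploads logits computed on a shared proxy dataset, which not every KD-based FL method does; the model size, proxy-set size, and 4-byte values are hypothetical, and compression is ignored.

```python
def bytes_for_parameters(num_params, bytes_per_value=4):
    """Upload size per round when a client sends its full model (float32 assumed)."""
    return num_params * bytes_per_value

def bytes_for_logits(num_proxy_samples, num_classes, bytes_per_value=4):
    """Upload size per round when a client sends logits on a shared proxy set."""
    return num_proxy_samples * num_classes * bytes_per_value

# Hypothetical setting: an 11M-parameter model, 5,000 proxy samples, 10 classes.
param_bytes = bytes_for_parameters(11_000_000)
logit_bytes = bytes_for_logits(5_000, 10)

print(param_bytes)                # 44000000
print(logit_bytes)                # 200000
print(param_bytes / logit_bytes)  # 220.0
```

The ratio depends entirely on the assumed sizes: with a large proxy set or many classes the logit payload can approach or exceed the parameter payload, which is exactly the kind of case a settling experiment would need to include.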

Figures

Figures reproduced from arXiv: 2406.10861 by Laiqiao Qin, Philip S. Yu, Tianqing Zhu, Wanlei Zhou.

Figure 1: The typical training process of knowledge distillation
Figure 2: The typical training process of federated learning consists of …
Figure 3: KD uses a teacher-student architecture. Teacher model transfers knowledge to the student model.
Figure 4: The FL process consists of six steps: ① preprocessing of the global model by clients, ② local training by clients, ③ further processing of local models by clients, ④ preprocessing of local models by the server upon receiving them, ⑤ aggregation of local models by the server to obtain the global model, and ⑥ further processing of the global model by the server. KD can be employed in all six steps. …
Figure 5: Comparison of the three KD-based FL methods. Feature-based FD shares model features, parameter …
Figure 6: The typical training process of feature-based FD: …
Figure 7: The typical training process of parameter-based FD: …
Figure 8: The typical training process of data-based FD: …
Figure 9: The relationships between different challenges in FL. Increased communication helps alleviate non-IID …

Original abstract

Federated Learning (FL) is a distributed and privacy-preserving machine learning paradigm that coordinates multiple clients to train a model while keeping the raw data localized. However, this traditional FL poses some challenges, including privacy risks, data heterogeneity, communication bottlenecks, and system heterogeneity issues. To tackle these challenges, knowledge distillation (KD) has been widely applied in FL since 2020. KD is a validated and efficacious model compression and enhancement algorithm. The core concept of KD involves facilitating knowledge transfer between models by exchanging logits at intermediate or output layers. These properties make KD an excellent solution for the long-lasting challenges in FL. Up to now, there have been few reviews that summarize and analyze the current trend and methods for how KD can be applied in FL efficiently. This article aims to provide a comprehensive survey of KD-based FL, focusing on addressing the above challenges. First, we provide an overview of KD-based FL, including its motivation, basics, taxonomy, and a comparison with traditional FL and where KD should execute. We also analyze the critical factors in KD-based FL in the appendix, including teachers, knowledge, data, and methods. We discuss how KD can address the challenges in FL, including privacy protection, data heterogeneity, communication efficiency, and personalization. Finally, we discuss the challenges facing KD-based FL algorithms and future research directions. We hope this survey can provide insights and guidance for researchers and practitioners in the FL area.
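
One way to picture a round of logit exchange is the following minimal sketch. It assumes that every client evaluates its local model on the same shared samples and that the server simply averages the resulting logits into a consensus target; this is one simple aggregation rule chosen for illustration, not the procedure of any particular method in the survey, and all names and values are hypothetical.

```python
def average_logits(client_logits):
    """Element-wise mean of per-client logits.

    client_logits[k][s][c] is client k's logit for class c on shared sample s.
    """
    num_clients = len(client_logits)
    num_samples = len(client_logits[0])
    num_classes = len(client_logits[0][0])
    return [
        [
            sum(client[s][c] for client in client_logits) / num_clients
            for c in range(num_classes)
        ]
        for s in range(num_samples)
    ]

# Two hypothetical clients, two shared samples, two classes.
clients = [
    [[2.0, 0.0], [0.0, 1.0]],
    [[4.0, 2.0], [2.0, 3.0]],
]

consensus = average_logits(clients)
print(consensus)  # [[3.0, 1.0], [1.0, 2.0]]
```

Each client could then distill from the consensus logits locally. Because the exchange is defined on outputs rather than weights, clients in this sketch are free to use different model architectures.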

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper is a survey claiming that knowledge distillation (KD) is an effective approach for addressing long-standing challenges in federated learning (FL) such as privacy risks, data heterogeneity, communication bottlenecks, and system heterogeneity. It structures the review around motivation and basics of KD-based FL, a taxonomy, comparisons to traditional FL, challenge-by-challenge analysis (privacy, heterogeneity, communication, personalization), an appendix covering critical factors (teachers, knowledge, data, methods), and future directions, while asserting that few prior reviews exist on this intersection since 2020.

Significance. If the coverage is accurate and representative, the survey would provide a useful synthesis for the FL community by organizing recent KD applications, offering a taxonomy, and identifying trends and open problems. The explicit structure separating challenge-specific applications from the appendix analysis of factors is a strength, as is the focus on practical integration points for KD in FL.

major comments (3)
  1. [Introduction] The assertion that 'few reviews' summarize KD applications in FL should be substantiated by explicitly citing and contrasting the survey with the small number of existing FL or KD surveys (e.g., those on FL challenges or model compression) to support the novelty claim.
  2. [Privacy protection discussion] The claim that KD mitigates privacy risks via logit exchange requires explicit discussion of potential leakage vectors from intermediate or output logits, as this is load-bearing for the 'excellent solution' synthesis.
  3. [Taxonomy] The proposed taxonomy of KD-based FL methods should clarify coverage of system heterogeneity variants, as the abstract lists this as a core FL challenge yet the taxonomy description focuses primarily on data and communication aspects.
minor comments (2)
  1. [Appendix] Appendix on critical factors: tables or quantitative summaries comparing teacher model choices across cited works would improve clarity and allow readers to assess trends.
  2. [Throughout] Ensure all citations in the challenge-specific sections are dated 2020 or later, consistent with the scope stated in the abstract.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments and positive recommendation for minor revision. We address each major point below and will revise the manuscript accordingly to strengthen the survey.

point-by-point responses
  1. Referee: [Introduction] the assertion that 'few reviews' summarize KD applications in FL should be substantiated by explicitly citing and contrasting against the small number of existing FL or KD surveys (e.g., those on FL challenges or model compression) to support the novelty claim.

    Authors: We agree that the novelty claim requires stronger substantiation. In the revised version, we will expand the introduction to explicitly cite and contrast with prior surveys on FL challenges (such as those covering privacy, heterogeneity, and communication) and on model compression/KD techniques, highlighting the absence of comprehensive KD-specific FL reviews since 2020 and the unique taxonomy and challenge-by-challenge structure of our work.
    revision: yes

  2. Referee: [Privacy protection discussion] the claim that KD mitigates privacy risks via logit exchange requires explicit discussion of potential leakage vectors from intermediate or output logits, as this is load-bearing for the 'excellent solution' synthesis.

    Authors: We acknowledge this important nuance. While the current manuscript notes the privacy benefits of logit exchange over raw data sharing, we will add explicit discussion of potential leakage vectors (e.g., model inversion or membership inference risks from logits) in the privacy section to provide a balanced analysis and qualify the 'excellent solution' framing with appropriate caveats and references to related attack literature.
    revision: yes

  3. Referee: [Taxonomy] the proposed taxonomy of KD-based FL methods should clarify coverage of system heterogeneity variants, as the abstract lists this as a core FL challenge yet the taxonomy description focuses primarily on data and communication aspects.

    Authors: We agree clarification is needed. System heterogeneity is addressed in the manuscript primarily via the personalization challenge section (which includes device-specific adaptations), but the taxonomy overview will be revised to explicitly map system heterogeneity variants (e.g., varying client compute and architectures) to relevant KD-based FL categories and cross-reference the personalization analysis for completeness.
    revision: yes
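
The leakage concern raised in point 2 can be made concrete with a toy example. The sketch below is a simple confidence-thresholding heuristic for membership inference, offered only as an illustration of why shared logits are not automatically private: it assumes that a model tends to be more confident on samples it was trained on. The threshold and the logits are hypothetical, and nothing here is taken from the paper.

```python
import math

def max_confidence(logits):
    """Largest softmax probability for one sample's logits."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    return max(exps) / sum(exps)

def guess_member(logits, threshold=0.9):
    """Guess 'training member' when the model is very confident (toy heuristic)."""
    return max_confidence(logits) >= threshold

# Hypothetical logits an observer might see in a logit-sharing round.
sharp = [8.0, 0.5, 0.2]   # very peaked output
flat = [1.2, 1.0, 0.9]    # nearly uniform output

print(guess_member(sharp))  # True
print(guess_member(flat))   # False
```

A heuristic this crude is easy to defeat, but it shows the shape of the argument: any caveat the authors add should say what an observer of logits can infer, not only that raw data stays local.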

Circularity Check

0 steps flagged

No significant circularity: survey without derivations or predictions

full rationale

This is a descriptive survey paper whose central claims consist of overviews, taxonomies, and syntheses of external literature on KD-based FL. No equations, parameter fits, predictions, or uniqueness theorems are presented that could reduce to self-definition or self-citation chains. All analysis draws from cited prior work rather than internal constructions, satisfying the criteria for a self-contained review with no load-bearing circular steps.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a survey paper with no new mathematical derivations, empirical claims, or modeling assumptions introduced by the authors.

pith-pipeline@v0.9.0 · 5798 in / 1085 out tokens · 27268 ms · 2026-05-23T23:46:13.303249+00:00 · methodology

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Federated Distillation on Edge Devices: Efficient Client-Side Filtering for Non-IID Data

    cs.LG 2025-08 unverdicted novelty 5.0

    EdgeFD uses a KMeans-based client-side filter to improve federated distillation accuracy close to IID levels on non-IID data distributions for resource-constrained edge devices.

Reference graph

Works this paper leans on

197 extracted references · 197 canonical work pages · cited by 1 Pith paper · 8 internal anchors
