pith. sign in

arxiv: 2412.00452 · v2 · submitted 2024-11-30 · 💻 cs.LG · cs.CV

Learning Locally, Revising Globally: Global Reviser for Federated Learning with Noisy Labels

Pith reviewed 2026-05-23 08:27 UTC · model grok-4.3

classification 💻 cs.LG cs.CV
keywords federated learningnoisy labelslabel correctionglobal reviserrobustnessdata heterogeneityFedGR
0
0 comments X

The pith

The global model in federated learning memorizes noisy labels slowly, allowing a reviser to correct them across clients without extra data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper establishes that the aggregated global model in federated learning learns clean patterns before fitting noisy labels. The authors use this property to build FedGR, a method with three modules that revises noisy labels on each client and regularizes local training. The approach works in a self-contained way even when noise types, ratios, and data distributions differ across clients. If the claim holds, federated systems could train on imperfect real-world labels without central clean data or client-specific noise detectors.

Core claim

The central claim is that by exploiting the inherent property that the global model of FL exhibits a slow memorization of noisy labels, FedGR improves the label-noise robustness of FL in a self-contained manner through three modules that collaboratively rectify noisy labels and regularize local training.

What carries the argument

FedGR, the Federated Global Reviser consisting of three modules that use global model predictions to rectify noisy labels and regularize local training.

If this is right

  • FedGR outperforms seven state-of-the-art baselines on three F-LN benchmarks even with severe label noise and data heterogeneity.
  • The method rectifies labels and regularizes training without requiring clean validation data or knowledge of noise characteristics.
  • Performance remains stable across varying noise types, ratios, and non-identical client data distributions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The slow-memorization property might be tested in other aggregation schemes that combine models from separate sources.
  • Similar global revision steps could address client-specific problems such as concept drift or class imbalance.
  • More frequent global updates might amplify the robustness benefit in deployed federated systems.

Load-bearing premise

The global model maintains reliable predictions and robust representations even when clients have different label-noise types, ratios, and data distributions.

What would settle it

An experiment in which the global model memorizes noisy labels at the same rate as local models, or in which FedGR shows no accuracy gain under high heterogeneity of noise and data.

Figures

Figures reproduced from arXiv: 2412.00452 by Gang Niu, Jiancheng Lv, Jian Wang, Mouxing Yang, Qing Ye, Tongliang Liu, Yuhao Zhou, Yuxin Tian.

Figure 1
Figure 1. Figure 1: Observation: The global model of FL memorizes label￾noise data slowly and learns underlying correct knowledge. According to (a), the global model of FL memorizes no more than 30% label-noise samples over the whole training, and as a contrast the typical centrally trained model memorizes over 80% label￾noise data. Following (b), the global model does not encounter the test performance drop like the centrall… view at source ↗
Figure 2
Figure 2. Figure 2: Illustration of FedGR, which effectively leverages the characteristics of the global model to achieve robust F-LNL. It performs [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Local EMA model updating process on client [PITH_FULL_IMAGE:figures/full_fig_p005_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: The (a) and (b) are the pearson coefficient analysis of [PITH_FULL_IMAGE:figures/full_fig_p008_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Memorization effect observation experimental results with FL client sample ratio 0.2. [PITH_FULL_IMAGE:figures/full_fig_p013_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Memorization effect observation experimental results with FL client sample ratio to 0.5. [PITH_FULL_IMAGE:figures/full_fig_p013_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Memorization effect observation experimental results with FL client sample ratio 1.0. [PITH_FULL_IMAGE:figures/full_fig_p013_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: The difference between the overfitting degree of the client’s local model on the client’s noisy labels and that of the global model on [PITH_FULL_IMAGE:figures/full_fig_p014_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: The difference between the overfitting degree of the client’s local model on the client’s noisy labels and that of the global model on [PITH_FULL_IMAGE:figures/full_fig_p014_9.png] view at source ↗
Figure 10
Figure 10. Figure 10: The difference between the overfitting degree of the client’s local model on the client’s noisy labels and that of the global model [PITH_FULL_IMAGE:figures/full_fig_p014_10.png] view at source ↗
Figure 11
Figure 11. Figure 11: The difference between the overfitting degree of the client’s local model on the client’s noisy labels and that of the global model [PITH_FULL_IMAGE:figures/full_fig_p015_11.png] view at source ↗
Figure 12
Figure 12. Figure 12: The difference between the overfitting degree of the client’s local model on the client’s noisy labels and that of the global model [PITH_FULL_IMAGE:figures/full_fig_p015_12.png] view at source ↗
Figure 13
Figure 13. Figure 13: The difference between the overfitting degree of the client’s local model on the client’s noisy labels and that of the global model [PITH_FULL_IMAGE:figures/full_fig_p015_13.png] view at source ↗
Figure 15
Figure 15. Figure 15: Visualization of the Non.I.I.D Dirichlet data partition [PITH_FULL_IMAGE:figures/full_fig_p018_15.png] view at source ↗
Figure 16
Figure 16. Figure 16: Visualization of the I.I.D data partition used by Fed [PITH_FULL_IMAGE:figures/full_fig_p019_16.png] view at source ↗
read the original abstract

Conventioanl federated learning (FL) heavily depends on high-quality labels, which are often impractical in the real world, leading to the federated label-noise (F-LN) problem. Worsely, the F-LN problem is exacerbated by the heterogeneity of FL, whereas clients experience different labelnoise types, ratios, and data distribution. In this study, we first observe an intriguing phenomenon that the global model of FL exhibits a slow memorization of noisy labels, suggesting its ability to maintain reliable predictions and robust representations in FL. Motivated on this, we propose a novel method termed Federated Global Reviser (FedGR), a straightforward yet effective method comprising three modules that collaboratively rectify noisy labels and regularize local training. By exploiting above inherent property, FedGR improve the label-noise robustness of FL in a self-contained manner. Extensive experiments on three widely used F-LN benchmarks demonstrate the superior performance of FedGR, consistently outperforming seven state-of-the-art baselines even in severe label-noise and data heterogeneity. Code will be released as soon as possible.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes Federated Global Reviser (FedGR) to address label noise in federated learning (F-LN) under client heterogeneity. It first observes that the global model exhibits slow memorization of noisy labels, preserving reliable predictions and robust representations. Motivated by this, FedGR introduces three collaborative modules for global label rectification and local training regularization. Experiments on three F-LN benchmarks report consistent outperformance over seven baselines even under severe noise and heterogeneity.

Significance. If the slow-memorization property holds under arbitrary per-client noise types, ratios, and distributions, the work supplies a self-contained, assumption-light approach to label-noise robustness in FL. This is potentially significant for practical FL deployments where clean labels are unavailable and heterogeneity is the norm; the absence of free parameters or external data requirements strengthens the contribution relative to prior methods that rely on clean validation sets or noise-model assumptions.

major comments (2)
  1. [Abstract / §1] Abstract and §1 (motivation): the central claim that the global model 'maintains reliable predictions and robust representations' under fully heterogeneous per-client noise types/ratios is load-bearing for the three-module FedGR design, yet the manuscript supplies no derivation or aggregation analysis showing why averaging preserves this property once client-specific noise patterns are no longer averaged out; the stress-test concern therefore lands and requires explicit empirical isolation or theoretical support.
  2. [§4] §4 (experiments): the abstract states 'consistent outperformance' on three benchmarks but reports no quantitative tables, error bars, statistical tests, or exclusion criteria; without these details the superiority claim cannot be verified and the cross-benchmark generalization argument is weakened.
minor comments (2)
  1. [Abstract] Abstract contains multiple typos and grammatical issues: 'Conventioanl' → 'Conventional', 'Worsely' → 'Worse', 'labelnoise' → 'label noise', 'Motivated on this' → 'Motivated by this', 'improve' → 'improves'.
  2. [Abstract] The phrase 'Code will be released as soon as possible' is vague; a concrete timeline or repository link would improve reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. Below we provide point-by-point responses to the major comments and indicate planned revisions.

read point-by-point responses
  1. Referee: [Abstract / §1] Abstract and §1 (motivation): the central claim that the global model 'maintains reliable predictions and robust representations' under fully heterogeneous per-client noise types/ratios is load-bearing for the three-module FedGR design, yet the manuscript supplies no derivation or aggregation analysis showing why averaging preserves this property once client-specific noise patterns are no longer averaged out; the stress-test concern therefore lands and requires explicit empirical isolation or theoretical support.

    Authors: We acknowledge that the manuscript presents the slow-memorization observation primarily as an empirical finding without a formal derivation of the federated averaging effect under per-client heterogeneous noise. To address this, we will add a dedicated subsection in §3 that includes additional controlled experiments isolating the global model's behavior across varying per-client noise types, ratios, and distributions. These stress tests will empirically demonstrate the preservation of reliable predictions and robust representations, thereby strengthening the motivation for the three collaborative modules in FedGR. revision: yes

  2. Referee: [§4] §4 (experiments): the abstract states 'consistent outperformance' on three benchmarks but reports no quantitative tables, error bars, statistical tests, or exclusion criteria; without these details the superiority claim cannot be verified and the cross-benchmark generalization argument is weakened.

    Authors: The full manuscript in §4 already contains quantitative results across the three F-LN benchmarks in Tables 1–3, with mean performance and standard deviations computed over multiple random seeds. No data exclusion criteria were applied beyond the standard benchmark protocols described. We will revise the text to explicitly reference these tables, add a brief description of the multi-seed protocol, and include a note on statistical significance (e.g., paired t-tests) to make the superiority claims fully verifiable. revision: partial

Circularity Check

0 steps flagged

No significant circularity; method motivated by empirical observation without self-referential derivations

full rationale

The paper presents FedGR as motivated by an observed empirical phenomenon (slow memorization of noisy labels by the global model) rather than any mathematical derivation chain. No equations, fitted parameters renamed as predictions, self-citations as load-bearing premises, or ansatzes are described in the provided text. The approach is framed as self-contained via the observed property and validated through experiments on external benchmarks, with no reduction of results to inputs by construction. This is the expected non-finding for an empirical method paper.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review reveals no explicit free parameters, axioms, or invented entities; the approach rests on the stated observation of slow global memorization and the design of three unspecified modules.

pith-pipeline@v0.9.0 · 5742 in / 973 out tokens · 24079 ms · 2026-05-23T08:27:55.073178+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

39 extracted references · 39 canonical work pages · 1 internal anchor

  1. [1]

    O’Connor, and Kevin McGuinness

    Eric Arazo, Diego Ortego, Paul Albert, Noel E. O’Connor, and Kevin McGuinness. Unsupervised Label Noise Modeling and Loss Correction. In Proceedings of the 36th International Conference on Machine Learning, ICML 2019, 9-15 June 2019, Long Beach, California, USA, pages 312–321. PMLR,

  2. [2]

    Kanwal, Tegan Maharaj, Asja Fischer, Aaron C

    Devansh Arpit, Stanislaw Jastrzebski, Nicolas Ballas, David Krueger, Emmanuel Bengio, Maxinder S. Kanwal, Tegan Maharaj, Asja Fischer, Aaron C. Courville, Yoshua Bengio, and Simon Lacoste-Julien. A Closer Look at Memorization in Deep Networks. In Proceedings of the 34th International Conference on Machine Learning, ICML 2017, Sydney, NSW, Australia, 6-11 ...

  3. [3]

    Goodfellow, Nico- las Papernot, Avital Oliver, and Colin Raffel

    David Berthelot, Nicholas Carlini, Ian J. Goodfellow, Nico- las Papernot, Avital Oliver, and Colin Raffel. MixMatch: A Holistic Approach to Semi-Supervised Learning. In Ad- vances in Neural Information Processing Systems 32: An- nual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8-14, 2019, Vancouver, BC, Canada, pages 5...

  4. [4]

    Exploring Simple Siamese Representation Learning

    Xinlei Chen and Kaiming He. Exploring Simple Siamese Representation Learning. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2021, Virtual, June 19- 25, 2021, pages 15750–15758. Computer Vision Foundation / IEEE, 2021. 6

  5. [5]

    Learning with Instance-Dependent La- bel Noise: A Sample Sieve Approach

    Hao Cheng, Zhaowei Zhu, Xingyu Li, Yifei Gong, Xing Sun, and Yang Liu. Learning with Instance-Dependent La- bel Noise: A Sample Sieve Approach. In 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3-7, 2021. OpenReview.net, 2021. 2

  6. [6]

    RandAugment: Practical Automated Data Augmenta- tion with a Reduced Search Space

    Ekin Dogus Cubuk, Barret Zoph, Jonathon Shlens, and Quoc Le. RandAugment: Practical Automated Data Augmenta- tion with a Reduced Search Space. In Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, Virtual, 2020. 7

  7. [7]

    Tsang, and Masashi Sugiyama

    Bo Han, Quanming Yao, Xingrui Yu, Gang Niu, Miao Xu, Weihua Hu, Ivor W. Tsang, and Masashi Sugiyama. Co- teaching: Robust training of deep neural networks with ex- tremely noisy labels. In Advances in Neural Information Processing Systems 31: Annual Conference on Neural Infor- mation Processing Systems 2018, NeurIPS 2018, December 3-8, 2018, Montr´eal, Ca...

  8. [8]

    Deep residual learning for image recognition

    Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, Las Vegas, NV , USA, June 27-30, 2016, pages 770–778. IEEE Computer Society, 2016. 12

  9. [9]

    Ball, Katie S

    Jeremy Irvin, Pranav Rajpurkar, Michael Ko, Yifan Yu, Sil- viana Ciurea-Ilcus, Christopher Chute, Henrik Marklund, Be- hzad Haghgoo, Robyn L. Ball, Katie S. Shpanskaya, Jayne Seekins, David A. Mong, Safwan S. Halabi, Jesse K. Sand- berg, Ricky Jones, David B. Larson, Curtis P. Langlotz, Bhavik N. Patel, Matthew P. Lungren, and Andrew Y . Ng. CheXpert: A L...

  10. [10]

    FedFixer: Mitigating Hetero- geneous Label Noise in Federated Learning

    Xinyuan Ji, Zhaowei Zhu, Wei Xi, Olga Gadyatskaya, Zilong Song, Yong Cai, and Yang Liu. FedFixer: Mitigating Hetero- geneous Label Noise in Federated Learning. In Thirty-Eighth AAAI Conference on Artificial Intelligence, AAAI 2024, Thirty- Sixth Conference on Innovative Applications of Artificial In- telligence, IAAI 2024, Fourteenth Symposium on Educatio...

  11. [11]

    To- wards Federated Learning against Noisy Labels via Local Self-Regularization

    Xuefeng Jiang, Sheng Sun, Yuwei Wang, and Min Liu. To- wards Federated Learning against Noisy Labels via Local Self-Regularization. In Proceedings of the 31st ACM Inter- national Conference on Information & Knowledge Manage- ment, Atlanta, GA, USA, October 17-21, 2022, pages 862–873. ACM, 2022. 3

  12. [12]

    Tack- ling Noisy Clients in Federated Learning with End-to-end Label Correction

    Xuefeng Jiang, Sheng Sun, Jia Li, Jingjing Xue, Runhan Li, Zhiyuan Wu, Gang Xu, Yuwei Wang, and Min Liu. Tack- ling Noisy Clients in Federated Learning with End-to-end Label Correction. In Proceedings of the 33rd ACM Interna- tional Conference on Information and Knowledge Manage- ment, pages 1015–1026, Boise ID USA, 2024. ACM. 2

  13. [13]

    Makowski, Daniel Rueckert, and Rickmer Braren

    Georgios Kaissis, Marcus R. Makowski, Daniel Rueckert, and Rickmer Braren. Secure, privacy-preserving and federated machine learning in medical imaging. Nat. Mach. Intell., 2 (6):305–311, 2020. 1

  14. [14]

    FedRN: Exploiting k-Reliable Neighbors Towards Robust Federated Learning

    Sangmook Kim, Wonyoung Shin, Soohyuk Jang, Hwanjun Song, and Se-Young Yun. FedRN: Exploiting k-Reliable Neighbors Towards Robust Federated Learning. In Proceed- ings of the 31st ACM International Conference on Informa- tion & Knowledge Management, Atlanta, GA, USA, October 17-21, 2022, pages 972–981. ACM, 2022. 1, 2, 3, 8

  15. [15]

    Learning multiple layers of features from tiny images

    Alex Krizhevsky and Geoffrey Hinton. Learning multiple layers of features from tiny images. Master’s thesis, 2009. 7, 12, 16

  16. [16]

    Junnan Li, Richard Socher, and Steven C. H. Hoi. DivideMix: Learning with Noisy Labels as Semi-supervised Learning. In 8th International Conference on Learning Representations, 9 ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020. Open- Review.net, 2020. 1, 2, 5, 7, 16

  17. [17]

    FedDiv: Collaborative Noise Filtering for Federated Learning with Noisy Labels

    Jichang Li, Guanbin Li, Hui Cheng, Zicheng Liao, and Yizhou Yu. FedDiv: Collaborative Noise Filtering for Federated Learning with Noisy Labels. In Thirty-Eighth AAAI Con- ference on Artificial Intelligence, AAAI 2024, Thirty-Sixth Conference on Innovative Applications of Artificial Intelli- gence, IAAI 2024, Fourteenth Symposium on Educational Advances in...

  18. [18]

    Fed- erated learning on non-iid data silos: An experimental study

    Qinbin Li, Yiqun Diao, Quan Chen, and Bingsheng He. Fed- erated learning on non-iid data silos: An experimental study. In 2022 IEEE 38th International Conference on Data Engi- neering (ICDE), pages 965–978. IEEE, 2022. 2, 7, 8, 12, 16

  19. [19]

    Federated Optimization in Heterogeneous Networks

    Tian Li, Anit Kumar Sahu, Manzil Zaheer, Maziar Sanjabi, Ameet Talwalkar, and Virginia Smith. Federated Optimization in Heterogeneous Networks. In Proceedings of the Third Conference on Machine Learning and Systems, MLSys 2020, Austin, TX, USA, March 2-4, 2020. mlsys.org, 2020. 1, 7, 16

  20. [20]

    Provably End-to-end Label-noise Learning with- out Anchor Points

    Xuefeng Li, Tongliang Liu, Bo Han, Gang Niu, and Masashi Sugiyama. Provably End-to-end Label-noise Learning with- out Anchor Points. In Proceedings of the 38th International Conference on Machine Learning, ICML 2021, 18-24 July 2021, Virtual Event, pages 6403–6413. PMLR, 2021. 3

  21. [21]

    Federated Learn- ing with Extremely Noisy Clients via Negative Distillation

    Yang Lu, Lin Chen, Yonggang Zhang, Yiliang Zhang, Bo Han, Yiu-ming Cheung, and Hanzi Wang. Federated Learn- ing with Extremely Noisy Clients via Negative Distillation. In Thirty-Eighth AAAI Conference on Artificial Intelligence, AAAI 2024, Thirty-Sixth Conference on Innovative Applica- tions of Artificial Intelligence, IAAI 2024, Fourteenth Sym- posium on...

  22. [22]

    Dick, and Akhil Mathur

    Ekdeep Singh Lubana, Chi Ian Tang, Fahim Kawsar, Robert P. Dick, and Akhil Mathur. Orchestra: Unsupervised Federated Learning via Globally Consistent Clustering. In Interna- tional Conference on Machine Learning, ICML 2022, 17-23 July 2022, Baltimore, Maryland, USA, pages 14461–14484. PMLR, 2022. 6

  23. [23]

    Communication- Efficient Learning of Deep Networks from Decentralized Data

    Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, and Blaise Ag ¨uera y Arcas. Communication- Efficient Learning of Deep Networks from Decentralized Data. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, AISTATS 2017, 20- 22 April 2017, Fort Lauderdale, FL, USA, pages 1273–1282. PMLR, 2017. 1, 3, 4, 7, 12

  24. [24]

    Deep Learning is Robust to Massive Label Noise

    David Rolnick, Andreas Veit, Serge J. Belongie, and Nir Shavit. Deep Learning is Robust to Massive Label Noise. CoRR, abs/1705.10694, 2017. 2

  25. [25]

    FixMatch: Simplifying Semi- Supervised Learning with Consistency and Confidence

    Kihyuk Sohn, David Berthelot, Nicholas Carlini, Zizhao Zhang, Han Zhang, Colin Raffel, Ekin Dogus Cubuk, Alexey Kurakin, and Chun-Liang Li. FixMatch: Simplifying Semi- Supervised Learning with Consistency and Confidence. In Advances in Neural Information Processing Systems 33: An- nual Conference on Neural Information Processing Systems 2020, NeurIPS 2020...

  26. [26]

    Learning From Noisy Labels With Deep Neural Networks: A Survey

    Hwanjun Song, Minseok Kim, Dongmin Park, Yooju Shin, and Jae-Gil Lee. Learning From Noisy Labels With Deep Neural Networks: A Survey. IEEE Trans. Neural Networks Learn. Syst., 34(11):8135–8153, 2023. 7

  27. [27]

    A Survey on Federated Recommendation Systems

    Zehua Sun, Yonghui Xu, Yong Liu, Wei He, Lanju Kong, Fangzhao Wu, Yali Jiang, and Lizhen Cui. A Survey on Federated Recommendation Systems. IEEE Transactions on Neural Networks and Learning Systems, pages 1–15, 2024. 1

  28. [28]

    FedCoop: Cooperative Federated Learning for Noisy Labels

    Kahou Tam, Li Li, Yan Zhao, and Chengzhong Xu. FedCoop: Cooperative Federated Learning for Noisy Labels. In ECAI 2023 - 26th European Conference on Artificial Intelligence, September 30 - October 4, 2023, Krak´ow, Poland - Including 12th Conference on Prestigious Applications of Intelligent Systems (PAIS 2023), pages 2298–2306. IOS Press, 2023. 2, 3

  29. [29]

    FedNoiL: A Simple Two-Level Sampling Method for Federated Learning with Noisy Labels

    Zhuowei Wang, Tianyi Zhou, Guodong Long, Bo Han, and Jing Jiang. FedNoiL: A Simple Two-Level Sampling Method for Federated Learning with Noisy Labels. CoRR, abs/2205.10110, 2022. 1, 2, 3

  30. [30]

    FedNoRo: Towards Noise-Robust Federated Learning by Addressing Class Imbalance and Label Noise Heterogeneity

    Nannan Wu, Li Yu, Xuefeng Jiang, Kwang-Ting Cheng, and Zengqiang Yan. FedNoRo: Towards Noise-Robust Federated Learning by Addressing Class Imbalance and Label Noise Heterogeneity. In Proceedings of the Thirty-Second Inter- national Joint Conference on Artificial Intelligence, IJCAI 2023, 19th-25th August 2023, Macao, SAR, China , pages 4424–4432. ijcai.or...

  31. [31]

    ProMix: Combating Label Noise via Maximizing Clean Sample Utility

    Ruixuan Xiao, Yiwen Dong, Haobo Wang, Lei Feng, Runze Wu, Gang Chen, and Junbo Zhao. ProMix: Combating Label Noise via Maximizing Clean Sample Utility. In Proceedings of the Thirty-Second International Joint Conference on Artifi- cial Intelligence, IJCAI 2023, 19th-25th August 2023, Macao, SAR, China, pages 4442–4450. ijcai.org, 2023. 2

  32. [32]

    Learning from massive noisy labeled data for image classification

    Tong Xiao, Tian Xia, Yi Yang, Chang Huang, and Xiaogang Wang. Learning from massive noisy labeled data for image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2691–2699,

  33. [33]

    Jingyi Xu, Zihan Chen, Tony Q. S. Quek, and Kai Fong Ernest Chong. FedCorr: Multi-Stage Federated Learning for Label Noise Correction. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022, New Orleans, LA, USA, June 18-24, 2022, pages 10174–10183. IEEE, 2022. 1, 2, 3, 4, 7, 8, 16, 19

  34. [34]

    Robust Federated Learning With Noisy La- bels

    Seunghan Yang, Hyoungseob Park, Junyoung Byun, and Changick Kim. Robust Federated Learning With Noisy La- bels. IEEE Intell. Syst., 37(2):35–43, 2022. 1, 2, 3

  35. [35]

    Tsang, and Masashi Sugiyama

    Xingrui Yu, Bo Han, Jiangchao Yao, Gang Niu, Ivor W. Tsang, and Masashi Sugiyama. How does Disagreement Help Gen- eralization against Label Corruption? In Proceedings of the 36th International Conference on Machine Learning, ICML 2019, 9-15 June 2019, Long Beach, California, USA, pages 7164–7173. PMLR, 2019. 2

  36. [36]

    Understanding deep learning requires rethinking generalization

    Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, and Oriol Vinyals. Understanding deep learning requires rethinking generalization. In 5th International Conference on 10 Learning Representations, ICLR 2017, Toulon, France, April 24-26, 2017, Conference Track Proceedings. OpenReview.net,

  37. [37]

    BadLabel: A Robust Per- spective on Evaluating and Enhancing Label-Noise Learning

    Jingfeng Zhang, Bo Song, Haohan Wang, Bo Han, Tongliang Liu, Lei Liu, and Masashi Sugiyama. BadLabel: A Robust Per- spective on Evaluating and Enhancing Label-Noise Learning. IEEE Trans. Pattern Anal. Mach. Intell., 46(6):4398–4409,

  38. [38]

    Federated Label-Noise Learning with Local Diversity Product Regularization

    Xiaochen Zhou and Xudong Wang. Federated Label-Noise Learning with Local Diversity Product Regularization. In Thirty-Eighth AAAI Conference on Artificial Intelligence, AAAI 2024, Thirty-Sixth Conference on Innovative Applica- tions of Artificial Intelligence, IAAI 2024, Fourteenth Sym- posium on Educational Advances in Artificial Intelligence, EAAI 2014, ...

  39. [39]

    server → client

    The entire hyperparameter configurations FedGR uses are listed in Tab. 7. These hyperparameters are divided into three groups, namely the parameters for FL setups, the opti- mization configurations of the client’s local training, and the specific hyperparameters for the proposed FedGR. Notably, all the baselines adopt the same optimization configurations....