FedBiCross: A Bi-Level Optimization Framework to Tackle Non-IID Challenges in Data-Free One-Shot Federated Learning on Medical Data

Hong-Ning Dai; Yalin Liu; Yinghao Zhang; Yong Xia; Yuexuan Xia

arxiv: 2601.01901 · v2 · pith:GMJLXOOInew · submitted 2026-01-05 · 💻 cs.LG

Yuexuan Xia, Yinghao Zhang, Yalin Liu, Hong-Ning Dai, Yong Xia

Pith reviewed 2026-05-21 16:41 UTC · model grok-4.3

classification 💻 cs.LG

keywords Federated Learning · Non-IID Data · One-Shot Learning · Data-Free Knowledge Distillation · Medical Image Analysis · Bi-Level Optimization · Personalized Models

The pith

FedBiCross clusters clients by output similarity and applies bi-level optimization to enable selective knowledge transfer in data-free one-shot federated learning for non-IID medical data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper targets the failure of standard aggregation in data-free one-shot federated learning when client data distributions differ sharply. Averaging predictions across all clients produces nearly uniform soft labels that give weak guidance for distillation. FedBiCross first groups clients into coherent sub-ensembles using similarity of their model outputs. It then runs bi-level cross-cluster optimization to learn weights that pull useful knowledge from other clusters while blocking harmful transfer. A final personalized distillation step adapts the model to each client's local distribution.

Core claim

Under non-IID conditions, global averaging of client predictions in data-free one-shot federated learning cancels out useful signals and yields uninformative supervision. FedBiCross solves this by clustering clients according to output similarity, then using bi-level optimization to compute adaptive cross-cluster weights that transfer only beneficial knowledge, followed by client-specific distillation that produces personalized models.
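
The cancellation effect in this claim can be reproduced with a toy calculation. The sketch below is illustrative only: the three client prediction vectors are made-up numbers rather than results from the paper, and `entropy` and `average` are hypothetical helper names.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(x * math.log(x) for x in p if x > 0)

def average(preds):
    """Element-wise mean of equally long probability vectors."""
    n = len(preds)
    return [sum(col) / n for col in zip(*preds)]

# Hypothetical clients: each saw mostly one class locally, so each is
# confident about a different label for the same synthetic input.
client_preds = [
    [0.90, 0.05, 0.05],
    [0.05, 0.90, 0.05],
    [0.05, 0.05, 0.90],
]

global_teacher = average(client_preds)   # soft label from global averaging
max_entropy = math.log(len(global_teacher))

print("averaged soft label:", [round(x, 3) for x in global_teacher])
print("entropy of one client:", round(entropy(client_preds[0]), 3))
print("entropy of the average:", round(entropy(global_teacher), 3))
print("maximum possible entropy:", round(max_entropy, 3))
```

In this toy case every client is individually confident, yet the averaged soft label reaches the maximum entropy for three classes, so it expresses no class preference for a student to learn from.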

What carries the argument

Bi-level cross-cluster optimization that learns adaptive weights for selective knowledge transfer between output-similarity clusters.
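
To make the two levels concrete, here is a deliberately small sketch of learning cross-cluster weights for one target cluster. It is not the paper's formulation, which this page does not spell out. Everything in it is an assumption made for illustration: the soft labels in `CLUSTER_PREDS`, the `REFERENCE` labels used by the outer objective, the closed-form "student" in `inner_fit`, and the grid search that stands in for a gradient-based outer solver.

```python
import math

# Hypothetical soft labels from three cluster sub-ensembles on two probe
# inputs of a binary task. Row = probe, column = class. Made-up numbers.
CLUSTER_PREDS = [
    [[0.9, 0.1], [0.4, 0.6]],  # cluster 0: the target cluster itself
    [[0.6, 0.4], [0.1, 0.9]],  # cluster 1: related, stronger on the second probe
    [[0.1, 0.9], [0.9, 0.1]],  # cluster 2: dissimilar, misleading on both probes
]
REFERENCE = [0, 1]  # assumed reference labels for the outer evaluation

def inner_fit(weights):
    """Inner level: fit a student to the weighted teacher mixture.

    The toy student is a free probability table, so the cross-entropy
    minimiser is the mixture itself; a real system would train a network.
    """
    n_probes = len(CLUSTER_PREDS[0])
    n_classes = len(CLUSTER_PREDS[0][0])
    return [
        [sum(w * preds[i][c] for w, preds in zip(weights, CLUSTER_PREDS))
         for c in range(n_classes)]
        for i in range(n_probes)
    ]

def outer_loss(student):
    """Outer level: mean cross-entropy of the student on the reference labels."""
    total = sum(math.log(student[i][y]) for i, y in enumerate(REFERENCE))
    return -total / len(REFERENCE)

def search_weights(steps=10):
    """Grid search over the weight simplex (stand-in for an outer optimiser)."""
    best_w, best_loss = None, float("inf")
    for a in range(steps + 1):
        for b in range(steps + 1 - a):
            w = (a / steps, b / steps, (steps - a - b) / steps)
            loss = outer_loss(inner_fit(w))
            if loss < best_loss:
                best_w, best_loss = w, loss
    return best_w, best_loss

best_w, best_loss = search_weights()
uniform_loss = outer_loss(inner_fit((1 / 3, 1 / 3, 1 / 3)))
print("learned weights:", best_w)
print("outer loss, learned vs uniform:", round(best_loss, 4), round(uniform_loss, 4))
```

With these toy numbers the search splits the weight between the target cluster and the related cluster and assigns none to the dissimilar one, which is the selective-transfer behaviour the framework aims for. Uniform weighting gives a higher outer loss.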

If this is right

FedBiCross outperforms existing baselines across varying degrees of non-IID data on four medical image datasets.
The method completes training in a single communication round without ever sharing raw patient data.
Personalized models are produced for each client instead of a single global model.
Negative transfer between dissimilar clients is reduced by the learned adaptive weights.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same clustering-plus-bi-level structure might apply to multi-round federated learning where communication cost is still a concern.
The framework could be tested on non-image medical data such as electronic health records to check domain specificity.
If the output-similarity clustering proves stable across random seeds, it could serve as a lightweight pre-processing step in other privacy-preserving training pipelines.

Load-bearing premise

Clustering clients by model output similarity forms coherent sub-ensembles that enable selective beneficial cross-cluster knowledge transfer while suppressing negative transfer.
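
A minimal sketch of what output-similarity clustering could look like follows. It assumes each client is summarised by its flattened prediction matrix on a shared set of probe inputs and groups clients with plain K-means. The client vectors, the helper names and the deterministic farthest-point seeding are assumptions for illustration, not details taken from the paper.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(vectors):
    """Element-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def kmeans(points, k, iters=20):
    """Lloyd's K-means with deterministic farthest-point seeding."""
    centers = [points[0]]
    while len(centers) < k:
        # Next seed: the point farthest from all seeds chosen so far.
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center for every client.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        # Update step: recompute each center from its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = centroid(members)
    return labels

# Hypothetical flattened prediction matrices: 2 probe inputs x 3 classes.
client_outputs = [
    [0.8, 0.1, 0.1, 0.7, 0.2, 0.1],  # client A
    [0.7, 0.2, 0.1, 0.8, 0.1, 0.1],  # client B
    [0.1, 0.1, 0.8, 0.1, 0.2, 0.7],  # client C
    [0.2, 0.1, 0.7, 0.1, 0.1, 0.8],  # client D
]

labels = kmeans(client_outputs, k=2)
print(labels)
```

Clients A and B end up in one group and C and D in the other. Note that the grouping reflects only agreement between outputs, so it cannot by itself tell useful shared knowledge from a shared bias.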

What would settle it

If experiments on the same four medical image datasets show that uniform averaging of all client predictions produces higher accuracy than the bi-level weighted transfer under identical non-IID partitions, the central claim would be falsified.

Figures

Figures reproduced from arXiv: 2601.01901 by Hong-Ning Dai, Yalin Liu, Yinghao Zhang, Yong Xia, Yuexuan Xia.

**Figure 1.** Overview of FedBiCross. (a) Previous methods (e.g., FedISCA [4]) construct a unified ensemble teacher from all clients, ignoring distribution ... (caption truncated; figures/full_fig_p003_1.png)

**Figure 2.** Ablation on the clustering stage for BloodMNIST. Test accuracy (%) ... (caption truncated; figures/full_fig_p005_2.png)

**Figure 3.** Visualization of synthetic images generated by different methods (figures/full_fig_p006_3.png)

Original abstract

Data-free knowledge distillation-based one-shot federated learning (OSFL) trains a model in a single communication round without sharing raw data, making OSFL attractive for privacy-sensitive medical applications. However, existing methods aggregate predictions from all clients to form a global teacher. Under non-IID data, conflicting predictions cancel out during averaging, yielding near-uniform soft labels that provide weak supervision for distillation. We propose FedBiCross, a personalized OSFL framework with three stages: (1) clustering clients by model output similarity to form coherent sub-ensembles, (2) bi-level cross-cluster optimization that learns adaptive weights to selectively leverage beneficial cross-cluster knowledge while suppressing negative transfer, and (3) personalized distillation for client-specific adaptation. Experiments on four medical image datasets demonstrate that FedBiCross consistently outperforms state-of-the-art baselines across different non-IID degrees.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

FedBiCross clusters clients by output similarity then applies bi-level optimization for selective cross-cluster weights in data-free one-shot medical FL, but the clustering step risks grouping shared biases instead of useful knowledge.

The punchline on this one is that FedBiCross tries to solve the negative transfer problem in data-free one-shot federated learning by first clustering clients according to how similar their local model outputs are, then using bi-level optimization to learn weights that pull useful knowledge across clusters while blocking harmful transfer. The paper does a solid job laying out the practical problem. In medical settings, you often can't share raw patient data, and doing everything in one round is appealing because it cuts communication. Existing methods just average all the client predictions to make a teacher, but when the data distributions are non-IID, which they almost always are in real hospitals, the averaged soft labels end up close to uniform and give weak supervision for distillation. Their approach adds clustering to form more coherent groups, adapts the cross-cluster weights with bi-level optimization, and finishes with personalized distillation per client. That integration in the medical context is what feels new compared to generic FL or KD papers. They report that it beats baselines on four medical image datasets across different levels of non-IID. If those results include proper controls and show meaningful gains, it could be useful for collaborative training without data movement.

The potential weak point is exactly the one in the stress-test note. Clustering on output similarity after local training might just be picking up on common failure modes rather than complementary strengths. Think different scanners or varying disease prevalence across sites: clients could make similar wrong predictions for the same reasons, so the similarity graph ends up connecting the wrong things. Then the bi-level optimization gets a bad starting point and can't reliably do the selective transfer. I'd want to see ablations that test this assumption directly, like checking what the clusters actually correspond to in terms of data characteristics.

Overall, this is for people working on federated learning applied to healthcare imaging. A reader who needs methods that work with strict privacy and limited rounds would get some ideas here, provided the empirical claims hold up under scrutiny. I think it deserves to go to peer review so the details on the optimization and the experimental setup can be checked properly.

Referee Report

2 major / 2 minor

Summary. The paper proposes FedBiCross, a data-free one-shot federated learning framework for medical imaging that tackles non-IID data via three stages: (1) clustering clients by model output similarity to form sub-ensembles, (2) bi-level cross-cluster optimization to learn adaptive weights for selective beneficial knowledge transfer while suppressing negative transfer, and (3) personalized distillation. It claims consistent outperformance over state-of-the-art baselines on four medical image datasets across varying non-IID degrees.

Significance. If the empirical claims hold and the clustering produces coherent sub-ensembles, the approach could meaningfully advance privacy-preserving OSFL in medical domains by addressing prediction cancellation under non-IID conditions through selective cross-cluster transfer. The bi-level optimization for adaptive weights represents a targeted mechanism for mitigating negative transfer, which is a common pain point in federated medical imaging.

major comments (2)

[Abstract (stages 1-2) / Method description] The central claim that FedBiCross outperforms baselines rests on stage (1) producing coherent sub-ensembles that enable effective selective transfer in stage (2). However, when client distributions differ by scanner, acquisition protocol, or pathology prevalence, model outputs can correlate due to shared failure modes rather than complementary features; this risks feeding a misleading similarity graph into the bi-level optimization, undermining the suppression of negative transfer. This assumption requires explicit validation (e.g., via ablation on clustering quality or failure-mode analysis) to support the outperformance results.
[Abstract] The abstract states that experiments demonstrate consistent outperformance but provides no quantitative results, error bars, ablation details, dataset statistics, or non-IID degree definitions. Without these, the load-bearing empirical claim cannot be assessed for robustness or reproducibility.

minor comments (2)

[Method] Clarify the exact similarity metric used for clustering (e.g., cosine on logits or KL divergence) and how the bi-level optimization is formulated (inner/outer objectives and variables).
[Experiments] Ensure all baselines are fairly re-implemented under the same one-shot, data-free constraints and report results with standard deviations over multiple runs.
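
For the first minor comment, the two candidate metrics the referee names can be written down in a few lines. This is a sketch of the general definitions only: which metric the paper actually uses is not stated on this page, and the logit vectors below are made-up values.

```python
import math

def softmax(logits):
    """Numerically stable softmax of a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) with a small epsilon to guard against log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def symmetric_kl(p, q):
    """Symmetrised KL divergence, usable as a dissimilarity score."""
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))

# Hypothetical logits from three clients on the same input.
logits_a = [2.0, 0.5, -1.0]
logits_b = [1.8, 0.7, -0.9]   # close to client a
logits_c = [-1.0, 0.5, 2.0]   # prefers a different class

probs_a, probs_b, probs_c = softmax(logits_a), softmax(logits_b), softmax(logits_c)

print("cosine(a, b):", round(cosine_similarity(logits_a, logits_b), 3))
print("cosine(a, c):", round(cosine_similarity(logits_a, logits_c), 3))
print("symmetric KL(a, b):", round(symmetric_kl(probs_a, probs_b), 3))
print("symmetric KL(a, c):", round(symmetric_kl(probs_a, probs_c), 3))
```

Cosine on logits is a similarity (higher means closer) while symmetric KL on softmax outputs is a dissimilarity (lower means closer), so the choice also fixes how the clustering step has to interpret the scores.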

Simulated Author's Rebuttal

2 responses · 0 unresolved

We sincerely thank the referee for the detailed and constructive feedback on our manuscript. We have carefully considered each comment and provide point-by-point responses below. We believe these revisions will strengthen the paper.

Referee: [Abstract (stages 1-2) / Method description] The central claim that FedBiCross outperforms baselines rests on stage (1) producing coherent sub-ensembles that enable effective selective transfer in stage (2). However, when client distributions differ by scanner, acquisition protocol, or pathology prevalence, model outputs can correlate due to shared failure modes rather than complementary features; this risks feeding a misleading similarity graph into the bi-level optimization, undermining the suppression of negative transfer. This assumption requires explicit validation (e.g., via ablation on clustering quality or failure-mode analysis) to support the outperformance results.

Authors: We appreciate the referee highlighting this important potential limitation in the clustering approach. While it is possible for similarities to stem from shared failure modes in medical imaging scenarios with varying scanners or protocols, our bi-level cross-cluster optimization is specifically designed to adaptively weight the knowledge transfer, thereby reducing the influence of negative transfers. To directly address this concern and provide explicit validation, we will add a new ablation study in the revised manuscript. This study will include metrics for clustering quality, such as the average pairwise similarity within clusters versus between clusters, and analyze how clustering affects the suppression of negative transfer in the bi-level optimization. We will also discuss failure modes observed in the experiments. revision: yes
Referee: [Abstract] The abstract states that experiments demonstrate consistent outperformance but provides no quantitative results, error bars, ablation details, dataset statistics, or non-IID degree definitions. Without these, the load-bearing empirical claim cannot be assessed for robustness or reproducibility.

Authors: We agree that including more specific details in the abstract would improve the reader's ability to evaluate the empirical claims. In the revised version, we will modify the abstract to incorporate key quantitative findings, including the average improvement margins over baselines across the datasets, references to standard deviations or error bars from repeated experiments, and concise information on the medical image datasets used along with the definitions of non-IID degrees (e.g., Dirichlet distribution parameters or label skew levels). This will be done while maintaining the abstract's brevity. revision: yes
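
The clustering-quality metric promised in the first response, average pairwise similarity within clusters versus between clusters, could be computed along the following lines. This is a sketch under assumptions: cosine similarity is chosen arbitrarily as the pairwise measure, and the client vectors and cluster labels are made-up values.

```python
import math
from itertools import combinations

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def within_between_similarity(vectors, labels):
    """Mean pairwise similarity inside clusters and across clusters.

    Assumes at least one same-cluster pair and one cross-cluster pair.
    """
    within, between = [], []
    for i, j in combinations(range(len(vectors)), 2):
        score = cosine(vectors[i], vectors[j])
        if labels[i] == labels[j]:
            within.append(score)
        else:
            between.append(score)
    return sum(within) / len(within), sum(between) / len(between)

# Hypothetical flattened client prediction vectors and cluster assignments.
client_outputs = [
    [0.8, 0.1, 0.1, 0.7, 0.2, 0.1],
    [0.7, 0.2, 0.1, 0.8, 0.1, 0.1],
    [0.1, 0.1, 0.8, 0.1, 0.2, 0.7],
    [0.2, 0.1, 0.7, 0.1, 0.1, 0.8],
]
cluster_labels = [0, 0, 1, 1]

mean_within, mean_between = within_between_similarity(client_outputs, cluster_labels)
print("within:", round(mean_within, 3), "between:", round(mean_between, 3))
```

A large gap between the two averages shows that the clusters are tight in output space. It does not answer the referee's deeper concern, because clients that share a failure mode would also score high within-cluster similarity, so a check against known data characteristics would still be needed.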

Circularity Check

0 steps flagged

No significant circularity; framework is procedural with external experimental validation

The paper proposes a three-stage FedBiCross framework (clustering by output similarity, bi-level cross-cluster optimization for adaptive weights, and personalized distillation) to address non-IID challenges in data-free one-shot FL. Performance is claimed via direct experiments on four medical image datasets outperforming baselines across non-IID degrees. No derivation chain reduces a claimed prediction or result to its own inputs by construction, self-definition, or load-bearing self-citation. The method introduces new procedural stages rather than renaming or fitting quantities tautologically; claims rest on empirical comparisons against external benchmarks, not internal equivalence.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Relies on domain assumptions about non-IID effects in FL and introduces a procedural framework without explicit free parameters or new physical entities.

axioms (1)

domain assumption Under non-IID data, conflicting client predictions cancel during averaging to produce near-uniform soft labels that provide weak supervision.
Directly stated as the motivating problem in the abstract.

pith-pipeline@v0.9.0 · 5696 in / 1151 out tokens · 51832 ms · 2026-05-21T16:41:45.247681+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · relation to the paper passage: unclear
Paper passage: Stage 2: Bi-Level Cross-Cluster Optimization ... w_k = argmin ... inner level trains ... outer level evaluates ... adaptive cross-cluster weights

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · relation to the paper passage: unclear
Paper passage: client clustering via model output similarity ... K-means on prediction matrices p_i

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

20 extracted references · 20 canonical work pages · 1 internal anchor

[1] Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, and Blaise Aguera y Arcas, “Communication-efficient learning of deep networks from decentralized data,” in AISTATS, 2017, pp. 1273–1282.
[2] Neel Guha, Ameet Talwalkar, and Virginia Smith, “One-shot federated learning,” arXiv preprint arXiv:1902.11175, 2019.
[3] Jie Zhang, Chen Chen, Bo Li, Lingjuan Lyu, Shuang Wu, Shouhong Ding, Chunhua Shen, and Chao Wu, “Dense: Data-free one-shot federated learning,” NeurIPS, vol. 35, pp. 21414–21428, 2022.
[4] Myeongkyun Kang, Philip Chikontwe, Soopil Kim, Kyong Hwan Jin, Ehsan Adeli, Kilian M Pohl, and Sang Hyun Park, “One-shot federated learning on medical data using knowledge distillation with image synthesis and client model adaptation,” in MICCAI, 2023, pp. 521–531.
[5] Rong Dai, Yonggang Zhang, Ang Li, Tongliang Liu, Xun Yang, and Bo Han, “Enhancing one-shot federated learning through data and ensemble co-boosting,” in ICLR, 2024.
[6] Hongxu Yin, Pavlo Molchanov, Jose M Alvarez, Zhizhong Li, Arun Mallya, Derek Hoiem, Niraj K Jha, and Jan Kautz, “Dreaming to distill: Data-free knowledge transfer via deepinversion,” in CVPR, 2020, pp. 8715–8724.
[7] Jian Xu, Xinyi Tong, and Shao-Lun Huang, “Personalized federated learning with feature alignment and classifier collaboration,” in ICLR, 2023.
[8] Jaehoon Oh, Sangmook Kim, and Se-Young Yun, “Fedbabu: Towards enhanced representation for federated image classification,” in ICLR, 2022.
[9] Pouya M Ghari and Yanning Shen, “Personalized federated learning with mixture of models for adaptive prediction and model fine-tuning,” NeurIPS, vol. 37, pp. 92155–92183, 2024.
[10] Christopher Briggs, Zhong Fan, and Peter Andras, “Federated learning with hierarchical clustering of local updates to improve training on non-iid data,” in IJCNN, 2020, pp. 1–9.
[11] Avishek Ghosh, Jichan Chung, Dong Yin, and Kannan Ramchandran, “An efficient framework for clustered federated learning,” NeurIPS, vol. 33, pp. 19586–19597, 2020.
[12] Saeed Vahidian, Mahdi Morafah, Weijia Wang, Vyacheslav Kungurtsev, Chen Chen, Mubarak Shah, and Bill Lin, “Efficient distribution similarity identification in clustered federated learning via principal angles between client data subspaces,” in AAAI, 2023, vol. 37, pp. 10043–10052.
[13] Anirudh Kasturi, Anish Reddy Ellore, and Chittaranjan Hota, “Fusion learning: A one shot federated learning,” in International Conference on Computational Science, 2020, pp. 424–436.
[14] Youxingzhu Deng, Yipeng Zhou, Gang Liu, Jessie Hui Wang, and Yu Shui, “Enhancing federated learning by one-shot transferring of intermediate features from clients,” in DSAA, 2023, pp. 1–11.
[15] Yanlin Zhou, George Pu, Xiyao Ma, Xiaolin Li, and Dapeng Wu, “Distilled one-shot federated learning,” arXiv preprint arXiv:2009.07999, 2020.
[16] Clare Elizabeth Heinbaugh, Emilio Luz-Ricca, and Huajie Shao, “Data-free one-shot federated learning under very high statistical heterogeneity,” in ICLR, 2023.
[17] Yufei Ma, Hanwen Zhang, Qiya Yang, Guibo Luo, and Yuesheng Zhu, “A new one-shot federated learning framework for medical imaging classification with feature-guided rectified flow and knowledge distillation,” in ECAI, 2025.
[18] Jiancheng Yang, Rui Shi, Donglai Wei, Zequan Liu, Lin Zhao, Bilian Ke, Hanspeter Pfister, and Bingbing Ni, “Medmnist v2-a large-scale lightweight benchmark for 2d and 3d biomedical image classification,” Scientific Data, vol. 10, no. 1, p. 41, 2023.
[19] Hanting Chen, Yunhe Wang, Chang Xu, Zhaohui Yang, Chuanjian Liu, Boxin Shi, Chunjing Xu, Chao Xu, and Qi Tian, “Data-free learning of student networks,” in ICCV, 2019, pp. 3514–3522.
[20] Avishek Ghosh, Justin Hong, Dong Yin, and Kannan Ramchandran, “Robust federated learning in a heterogeneous environment,” arXiv preprint arXiv:1906.06629, 2019.
