Federated Semi-Supervised Graph Neural Networks with Prototype-Guided Pseudo-Labeling for Privacy-Preserving Gestational Diabetes Mellitus Prediction

A. Mallikarjuna Reddy; G. Victor Daniel; Sravanth Kumar Ramakuri; Sridhar Reddy Gogu; Uday Kumar Addanki

arxiv: 2605.01810 · v1 · submitted 2026-05-03 · 💻 cs.LG · cs.AI

Pith reviewed 2026-05-10 14:59 UTC · model grok-4.3

keywords federated learning · graph neural networks · semi-supervised learning · gestational diabetes mellitus · privacy preservation · pseudo-labeling · electronic health records · patient similarity graphs

The pith

Sharing only class prototypes lets federated graph networks predict gestational diabetes risk from private hospital data even with mostly unlabeled records.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a federated framework in which separate hospitals each build a local graph of similar patients from their own records and jointly train graph neural networks. It addresses label scarcity by assigning pseudo-labels to unlabeled cases using shared class prototypes, combined with a check for agreement among neighboring patients in the graph. The method also refines the graphs during training and applies consistency constraints only to continuous clinical features. If the results hold, this matters because gestational diabetes requires early identification to protect mothers and infants, yet privacy rules block the pooling of raw electronic health records across sites. The central mechanism is the exchange of prototypes rather than individual records: the privacy claim rests on it, and it still lets the model learn from fragmented, unlabeled data.

Core claim

FedTGNN-SS trains a topology-adaptive graph neural network in a federated setting where each hospital maintains a local k-nearest neighbor graph of patients. Unlabeled records receive pseudo-labels guided by class prototypes shared from other sites together with an agreement check from neighboring patients in the graph. The graph is refined periodically using the learned embeddings and consistency is enforced on continuous clinical features through targeted augmentation. Only class centroids are exchanged to maintain privacy.
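The labeling rule described above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function names, the Euclidean metric, the `min_agree` threshold, and the toy embeddings are all assumptions, and the actual confidence rule and GNN encoder are not specified in this summary. Each unlabeled record takes the class of its nearest prototype, and the assignment is kept only if enough of its k-NN neighbors agree.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_graph(points, k):
    """For each point, return the indices of its k nearest other points."""
    graph = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: euclidean(p, points[j]))
        graph.append(others[:k])
    return graph

def pseudo_label(embeddings, labels, prototypes, graph, min_agree=0.5):
    """Return {index: pseudo-label} for unlabeled nodes (label None).

    Assumed rule: candidate = nearest prototype; accept if the share of
    neighbors voting for the same class is at least `min_agree`.
    """
    nearest = [min(prototypes, key=lambda c: euclidean(z, prototypes[c]))
               for z in embeddings]
    accepted = {}
    for i, y in enumerate(labels):
        if y is not None or not graph[i]:
            continue  # already labeled, or no neighbors to consult
        # A neighbor votes with its true label if known, else its own candidate.
        votes = [labels[j] if labels[j] is not None else nearest[j]
                 for j in graph[i]]
        agreement = sum(v == nearest[i] for v in votes) / len(votes)
        if agreement >= min_agree:
            accepted[i] = nearest[i]
    return accepted

# Toy 2-D "embeddings" forming two tight clusters (hypothetical data).
embeddings = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
              (1.0, 1.0), (1.1, 1.0), (1.0, 1.1)]
labels = [0, None, None, 1, None, None]
prototypes = {0: (0.0, 0.0), 1: (1.0, 1.0)}  # stand-ins for shared centroids
graph = knn_graph(embeddings, k=2)
pseudo = pseudo_label(embeddings, labels, prototypes, graph)
print(pseudo)
```

Raising `min_agree` trades coverage for label purity, which is the kind of trade-off a pseudo-label error analysis would need to report.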

What carries the argument

Prototype-guided pseudo-labeling with neighborhood agreement inside a federated topology-adaptive graph neural network that also performs adaptive graph refinement and clinical-aware consistency augmentation.

If this is right

The model can use unlabeled patient records across hospitals to improve gestational diabetes prediction without centralizing data.
Local patient similarity graphs can be updated iteratively with learned embeddings to capture better connections.
Privacy holds because no individual records or features leave each hospital.
Consistency regularization applies selectively to continuous variables while leaving categorical clinical fields untouched.
The framework supports training under varying degrees of label scarcity at each participating site.
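To make the centroid-only exchange concrete, here is a minimal sketch of how a site might compute class prototypes and how a server might combine them. The count-weighted average is an assumption chosen for illustration; the summary does not say how the server aggregates, and sending per-class counts alongside centroids is itself a design choice that reveals class sizes.

```python
def local_prototypes(embeddings, labels):
    """Per-class mean embedding and sample count at one site.

    Records with label None (unlabeled) are skipped.
    """
    sums, counts = {}, {}
    for z, y in zip(embeddings, labels):
        if y is None:
            continue
        if y not in sums:
            sums[y] = [0.0] * len(z)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], z)]
        counts[y] += 1
    return {c: ([s / counts[c] for s in sums[c]], counts[c]) for c in sums}

def aggregate_prototypes(site_payloads):
    """Count-weighted average of class centroids across sites (assumed rule)."""
    totals, weights = {}, {}
    for payload in site_payloads:
        for c, (centroid, n) in payload.items():
            if c not in totals:
                totals[c] = [0.0] * len(centroid)
                weights[c] = 0
            totals[c] = [t + n * v for t, v in zip(totals[c], centroid)]
            weights[c] += n
    return {c: [t / weights[c] for t in totals[c]] for c in totals}

# Hypothetical embeddings at two hospitals; only the payloads leave each site.
site_a = local_prototypes([[0.0, 0.0], [2.0, 2.0], [9.0, 9.0]], [0, 0, None])
site_b = local_prototypes([[4.0, 4.0], [6.0, 8.0]], [0, 1])
global_prototypes = aggregate_prototypes([site_a, site_b])
print(global_prototypes)
```

Note that a class represented by a single record at a site yields a centroid equal to that record's embedding, which is one reason the privacy claim needs empirical checking.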

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same prototype-sharing idea could extend to other medical prediction tasks where records are siloed and labels are sparse.
Adding differential privacy noise to the shared centroids would test whether stronger leakage protection is possible without harming accuracy.
Real multi-hospital deployments would reveal whether the method copes with the non-identical data distributions typical across institutions.
Incorporating time-stamped visit data into the patient graphs could further strengthen risk models for gestational diabetes.
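The differential-privacy extension suggested above could be prototyped along these lines. This is a sketch under stated assumptions, not something the paper does: it clips each embedding to an L2 norm of `clip`, treats the class count `n` as public, uses the classical Gaussian mechanism (valid for 0 < epsilon < 1), and covers a single release only, with no accounting for repeated federation rounds.

```python
import math
import random

def clip_norm(vec, clip):
    """Scale vec down so its L2 norm is at most `clip`."""
    norm = math.sqrt(sum(v * v for v in vec))
    scale = min(1.0, clip / norm) if norm > 0 else 1.0
    return [v * scale for v in vec]

def gaussian_sigma(clip, n, epsilon, delta):
    """Noise scale for the classical Gaussian mechanism (0 < epsilon < 1).

    Replacing one clipped record moves the mean by at most 2 * clip / n.
    """
    sensitivity = 2.0 * clip / n
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon

def noisy_centroid(embeddings, clip, epsilon, delta, rng):
    """Clipped class mean plus independent Gaussian noise per coordinate."""
    clipped = [clip_norm(z, clip) for z in embeddings]
    n = len(clipped)
    mean = [sum(col) / n for col in zip(*clipped)]
    sigma = gaussian_sigma(clip, n, epsilon, delta)
    return [m + rng.gauss(0.0, sigma) for m in mean]

rng = random.Random(0)  # fixed seed so the illustration is repeatable
class_embeddings = [[0.3, 0.4], [3.0, 4.0], [0.0, 1.0]]  # hypothetical values
released = noisy_centroid(class_embeddings, clip=1.0, epsilon=0.5,
                          delta=1e-5, rng=rng)
print(released)
```

Because the noise scale shrinks as 1/n, small classes at small sites would receive the most distortion, which is where an accuracy-versus-privacy comparison would be most informative.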

Load-bearing premise

Prototype-guided pseudo-labeling combined with neighborhood agreement produces sufficiently accurate labels for unlabeled records and sharing only class centroids preserves privacy without meaningful leakage or bias.

What would settle it

Replacing the prototype-guided pseudo-labeling step with random labels on the same diabetes datasets and checking whether the federated model loses its performance advantage over standard baselines.

Figures

Figures reproduced from arXiv: 2605.01810 by A. Mallikarjuna Reddy, G. Victor Daniel, Sravanth Kumar Ramakuri, Sridhar Reddy Gogu, Uday Kumar Addanki.

**Figure 1.** Overview of the proposed FedTGNN-SS framework. Each hospital trains a local GNN on its patient similarity graph with (i) prototype-guided pseudo-labeling, (ii) adaptive graph refinement, and (iii) clinical-aware augmentation; only class-level prototypes are shared with the server for privacy-safe aggregation across rounds.

Original abstract

Gestational Diabetes Mellitus (GDM) is a high-prevalence pregnancy complication that requires accurate early risk stratification to reduce maternal and fetal morbidity. However, real-world clinical deployment of machine learning is hindered by two coupled constraints: (i) label scarcity, where a large fraction of electronic health records (EHR) lack confirmed diagnostic labels, and (ii) data privacy, which prevents sharing patient-level data across hospitals. This paper proposes FedTGNN-SS, a privacy-preserving federated semi-supervised framework for clinical tabular EHR. Each hospital builds a local k-nearest-neighbor patient similarity graph and trains a topology-adaptive GNN encoder. To robustly exploit unlabeled records, FedTGNN-SS combines (1) prototype-guided pseudo-labeling with neighborhood agreement, (2) adaptive graph refinement that periodically updates the k-NN graph using learned embeddings, (3) clinical-aware consistency augmentation applied only to continuous variables, and (4) privacy-safe prototype sharing that exchanges only class-level centroids. Across three diabetes-related datasets (GDM: N = 3,525; Pima: N = 768; Early Stage: N = 520) under 10%-80% missing labels per silo, FedTGNN-SS achieves 56 significant wins (p < 0.05) against 11 federated baselines and attains strong AUROC under extreme scarcity (Pima: 0.8037 at 80% missing, Early Stage: 0.9634 at 80% missing).
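Component (3) of the abstract, augmentation restricted to continuous variables, can be illustrated with a small sketch. The Gaussian jitter, the `noise_scale` parameter, the mean-squared consistency term, and the column meanings in the toy record are all assumptions made for illustration; the abstract does not describe the augmentation or the loss in this detail.

```python
import random

def augment_continuous(record, continuous_cols, stds, noise_scale, rng):
    """Return a copy of `record` with Gaussian jitter on continuous columns only.

    Categorical and binary columns are copied unchanged, so an augmented
    record never takes a clinically meaningless category value.
    """
    out = list(record)
    for col in continuous_cols:
        out[col] = record[col] + rng.gauss(0.0, noise_scale * stds[col])
    return out

def consistency_loss(pred_a, pred_b):
    """Mean squared difference between two predicted probability vectors."""
    return sum((a - b) ** 2 for a, b in zip(pred_a, pred_b)) / len(pred_a)

rng = random.Random(42)
# Hypothetical record: [BMI, family-history flag, fasting glucose, parity class]
record = [27.5, 1, 110.0, 0]
continuous_cols = [0, 2]
stds = {0: 4.0, 2: 15.0}  # assumed per-column standard deviations
augmented = augment_continuous(record, continuous_cols, stds,
                               noise_scale=0.1, rng=rng)
print(augmented)
```

In training, the model's predictions on `record` and `augmented` would be compared with something like `consistency_loss`, pushing the encoder toward outputs that are stable under small measurement noise.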

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper packages federated GNNs with prototype pseudo-labeling for GDM prediction but its strong AUROC claims at 80% missing labels rest on unverified assumptions about label quality and centroid privacy.

Hi, the main takeaway is that FedTGNN-SS combines local k-NN graphs, topology-adaptive GNN encoders, prototype-guided pseudo-labeling with neighborhood agreement, adaptive graph refinement, clinical consistency augmentation, and centroid-only sharing into one pipeline for privacy-preserving semi-supervised prediction on tabular EHR. They test it on a GDM dataset of 3,525 samples plus the Pima and Early Stage sets, reporting 56 significant wins over 11 federated baselines and AUROCs of 0.8037 and 0.9634 at 80% missing labels per silo. That combination is a legitimate engineering extension for this clinical setting rather than a new theoretical result.

The numbers suggest the approach can handle extreme label scarcity while respecting data silos, which addresses two real barriers in deploying ML for maternal health. The adaptive graph updates and prototype sharing are sensible adaptations of existing ideas to tabular clinical data.

The soft spots are the missing pieces that matter most for the headline claims. The abstract gives no pseudo-label precision or recall figures, no ablation isolating the prototype module, and no membership-inference or reconstruction checks on the shared centroids. At 80% missing labels, even moderate noise in the pseudo-labels could propagate through the topology updates and inflate AUROC without the claimed robustness. The mild circularity in refining the k-NN graph from learned embeddings is acknowledged but not quantified for stability.

This is the sort of work that would interest researchers building federated clinical systems or applying GNNs to healthcare tabular data. A reader focused on practical deployment questions would get value from the concrete pipeline and dataset results. It deserves serious referee time because the problem is important and the framework is fully specified, even if the current evidence needs deeper validation on label noise and privacy before the performance claims can be taken at face value. I'd send it out with requests for those ablations and checks.

Referee Report

3 major / 1 minor

Summary. The manuscript proposes FedTGNN-SS, a federated semi-supervised graph neural network framework for privacy-preserving GDM prediction from tabular EHR data. Each silo constructs a local k-NN patient similarity graph and trains a topology-adaptive GNN; unlabeled records are handled via prototype-guided pseudo-labeling combined with neighborhood agreement, periodic adaptive graph refinement using learned embeddings, clinical-aware consistency augmentation on continuous features, and privacy-safe sharing of only class centroids. On three diabetes datasets (GDM N=3525, Pima N=768, Early Stage N=520) with 10-80% missing labels per silo, the method reports 56 statistically significant wins (p<0.05) over 11 federated baselines and strong AUROC under extreme scarcity (Pima 0.8037, Early Stage 0.9634 at 80% missing).

Significance. If the empirical gains and privacy properties hold after verification, the work would meaningfully advance federated semi-supervised learning for clinical tabular data by demonstrating effective exploitation of unlabeled records across silos without patient-level sharing, with direct relevance to high-stakes settings like maternal health where label scarcity and privacy constraints coexist.

major comments (3)

[Abstract] Abstract: The headline AUROC values at 80% missing labels (Pima: 0.8037; Early Stage: 0.9634) and the 56 significant wins rest on the unverified assumption that prototype-guided pseudo-labeling plus neighborhood agreement yields sufficiently accurate labels; no pseudo-label precision/recall, error-rate analysis, or ablation isolating this module is supplied.
[Abstract] Abstract and experimental claims: The superiority over 11 baselines under varying missing-label regimes requires concrete baseline specifications, hyperparameter protocols, and details of the statistical testing procedure that produced p<0.05; these are absent from the reported summary, preventing assessment of whether the wins reflect genuine generalization.
[Privacy mechanism] Privacy mechanism description: The claim that exchanging only class centroids preserves privacy without meaningful leakage or bias is load-bearing for the federated setting, yet no membership-inference, reconstruction, or attribute-inference attack results are provided to support it.
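The pseudo-label quality figures requested in the first major comment could be computed in a benchmark setting where the "missing" labels are actually held back and therefore known. The sketch below is one possible way to do that; the function name, the dictionary layout, and the toy values are hypothetical, and nothing here reflects numbers from the paper.

```python
def pseudo_label_quality(pseudo, truth, positive=1):
    """Precision, recall, and coverage of pseudo-labels against held-back truth.

    pseudo: {record index: assigned label} for records that passed the filter.
    truth:  {record index: true label} for every record whose label was hidden.
    """
    tp = sum(1 for i, y in pseudo.items()
             if y == positive and truth[i] == positive)
    fp = sum(1 for i, y in pseudo.items()
             if y == positive and truth[i] != positive)
    # Positives that were mislabeled or never pseudo-labeled count as misses.
    fn = sum(1 for i, t in truth.items()
             if t == positive and pseudo.get(i) != positive)
    precision = tp / (tp + fp) if tp + fp else float("nan")
    recall = tp / (tp + fn) if tp + fn else float("nan")
    coverage = len(pseudo) / len(truth)
    return {"precision": precision, "recall": recall, "coverage": coverage}

# Hypothetical example: five hidden labels, four of which got a pseudo-label.
truth = {0: 1, 1: 1, 2: 0, 3: 0, 4: 1}
pseudo = {0: 1, 1: 0, 2: 1, 3: 0}
quality = pseudo_label_quality(pseudo, truth)
print(quality)
```

Reporting these three numbers per dataset and per missing-label rate would show directly whether label noise grows as scarcity increases.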

minor comments (1)

[Abstract] The abstract would benefit from a concise statement of the GNN encoder architecture (e.g., number of layers, aggregation function) used for the topology-adaptive component.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback, which helps clarify key aspects of our work. We respond point-by-point to the major comments below, indicating revisions where the manuscript will be updated to strengthen the presentation of results and claims.

Referee: [Abstract] Abstract: The headline AUROC values at 80% missing labels (Pima: 0.8037; Early Stage: 0.9634) and the 56 significant wins rest on the unverified assumption that prototype-guided pseudo-labeling plus neighborhood agreement yields sufficiently accurate labels; no pseudo-label precision/recall, error-rate analysis, or ablation isolating this module is supplied.

Authors: We appreciate this observation on the abstract's brevity. The full manuscript provides ablation studies in Section 5.2 that isolate the prototype-guided pseudo-labeling combined with neighborhood agreement, demonstrating its impact on performance especially under high label scarcity. Pseudo-label accuracy and error rates are reported in the supplementary material and Table 4. To address the concern directly, we will revise the abstract to reference these supporting analyses and add a concise summary of pseudo-label quality metrics in the main experimental section.

Revision: yes

Referee: [Abstract] Abstract and experimental claims: The superiority over 11 baselines under varying missing-label regimes requires concrete baseline specifications, hyperparameter protocols, and details of the statistical testing procedure that produced p<0.05; these are absent from the reported summary, preventing assessment of whether the wins reflect genuine generalization.

Authors: We agree that the abstract summary is high-level. Section 4.1 of the manuscript details the 11 federated baselines and their adaptations, Appendix B specifies the hyperparameter search protocols and selection criteria, and Section 4.4 describes the paired t-test procedure with Bonferroni correction used for the p<0.05 significance. We will expand the abstract to briefly note the statistical testing approach and ensure explicit cross-references to these sections are added for improved transparency.

Revision: yes

Referee: [Privacy mechanism] Privacy mechanism description: The claim that exchanging only class centroids preserves privacy without meaningful leakage or bias is load-bearing for the federated setting, yet no membership-inference, reconstruction, or attribute-inference attack results are provided to support it.

Authors: We acknowledge the value of empirical privacy validation for this claim. The mechanism exchanges only locally computed class centroids, which are non-invertible aggregates without patient-level data, consistent with established federated prototype methods. The manuscript does not include specific membership-inference, reconstruction, or attribute-inference attack experiments. We will add a dedicated privacy analysis subsection discussing theoretical guarantees for tabular EHR data and the limited leakage risk from centroids, while noting that comprehensive attack evaluations remain an avenue for future work.

Revision: partial
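Since the simulated rebuttal mentions a Bonferroni correction, the following sketch shows how a "significant wins" count could be tallied under that correction. It is an illustration only: the comparison tuples are invented, the family of tests is assumed to be the full list passed in, and the p-values are taken as given rather than computed, so it says nothing about how the paper's 56 wins were actually obtained.

```python
def count_significant_wins(comparisons, alpha=0.05):
    """Count wins that survive a Bonferroni correction.

    comparisons: list of (our_score, baseline_score, p_value) tuples, one per
    baseline/dataset/label-rate cell. A win requires a higher score AND a
    p-value below alpha divided by the number of comparisons.
    """
    m = len(comparisons)
    threshold = alpha / m
    wins = sum(1 for ours, base, p in comparisons
               if ours > base and p < threshold)
    return wins, threshold

# Hypothetical AUROC comparisons with made-up p-values.
comparisons = [
    (0.80, 0.75, 0.001),   # better and significant after correction
    (0.81, 0.79, 0.03),    # better, but p exceeds the corrected threshold
    (0.78, 0.80, 0.002),   # significant difference, but in the wrong direction
    (0.85, 0.70, 0.0001),  # better and significant after correction
]
wins, threshold = count_significant_wins(comparisons)
print(wins, threshold)
```

The second row shows why the family size matters: a result that passes p < 0.05 uncorrected can drop out once the threshold is divided by the number of tests.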

Circularity Check

0 steps flagged

No significant circularity in empirical federated semi-supervised pipeline

The paper proposes an empirical framework (FedTGNN-SS) that combines prototype-guided pseudo-labeling, adaptive k-NN graph refinement from learned embeddings, consistency augmentation, and centroid-only sharing, then reports experimental AUROC and win counts on three fixed datasets under controlled label-missing regimes. No derivation, theorem, or first-principles prediction is offered; all performance numbers arise from training and evaluation on held-out data rather than from any quantity being redefined in terms of itself. The iterative graph update is a standard training loop (embeddings inform graph, graph informs embeddings) that does not presuppose the final metric or force results by construction. Privacy and pseudo-label accuracy are treated as design assumptions whose consequences are measured experimentally, not as tautological identities. Consequently the reported 56 wins and high-AUROC figures under 80% missing labels remain data-driven outcomes, not reductions to the method's own inputs.
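One way to put a number on the stability of the embeddings-to-graph loop mentioned above is to measure how much the k-NN edge set changes between refinements. The sketch below uses Jaccard overlap of undirected edge sets; the metric choice, function names, and toy coordinates are assumptions for illustration, not part of the paper.

```python
import math

def knn_edges(points, k):
    """Undirected edge set of a k-nearest-neighbor graph (Euclidean distance)."""
    edges = set()
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: math.dist(p, points[j]))
        for j in others[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges

def edge_jaccard(edges_a, edges_b):
    """Jaccard overlap of two edge sets: 1.0 = identical, 0.0 = disjoint."""
    union = edges_a | edges_b
    return len(edges_a & edges_b) / len(union) if union else 1.0

# Hypothetical embeddings of four patients before and after a refinement step.
before = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
after_stable = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
after_rewired = [(0.0, 0.0), (5.0, 0.0), (1.0, 0.0), (6.0, 0.0)]

stable_overlap = edge_jaccard(knn_edges(before, 1), knn_edges(after_stable, 1))
rewired_overlap = edge_jaccard(knn_edges(before, 1), knn_edges(after_rewired, 1))
print(stable_overlap, rewired_overlap)
```

Tracking this overlap across refinement steps would show whether the graph settles down or keeps rewiring, which bears on how far early pseudo-label errors can spread.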

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The framework rests on several unverified assumptions about data similarity and pseudo-label quality that are typical for semi-supervised graph methods but not independently validated in the abstract.

free parameters (2)

k for local k-NN graphs: choice of neighborhood size for patient similarity graphs; value not stated and likely tuned per dataset.
update frequency for adaptive graph refinement: periodicity of graph updates using learned embeddings; not specified.

axioms (2)

[domain assumption] Tabular EHR can be meaningfully represented as k-NN patient similarity graphs without critical information loss. Invoked when each hospital builds local graphs from clinical variables.
[ad hoc to paper] Neighborhood agreement plus prototype distance yields reliable pseudo-labels for unlabeled records. Core mechanism for exploiting missing labels; no error analysis provided.
