arxiv: 2605.01810 · v1 · submitted 2026-05-03 · 💻 cs.LG · cs.AI

Federated Semi-Supervised Graph Neural Networks with Prototype-Guided Pseudo-Labeling for Privacy-Preserving Gestational Diabetes Mellitus Prediction

Pith reviewed 2026-05-10 14:59 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords federated learning · graph neural networks · semi-supervised learning · gestational diabetes mellitus · privacy preservation · pseudo-labeling · electronic health records · patient similarity graphs

The pith

Sharing only class prototypes lets federated graph networks predict gestational diabetes risk from private hospital data even with mostly unlabeled records.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a federated framework in which separate hospitals each build a local graph of similar patients from their own records and jointly train graph neural networks. It addresses label scarcity by guiding pseudo-labels for unlabeled cases through shared class prototypes, together with checks for agreement among neighboring patients in the graph. The method also refines the graphs during training and applies consistency constraints only to continuous clinical features. If the claims hold, this matters because gestational diabetes requires early identification to protect mothers and infants, yet privacy rules block the pooling of raw electronic health records across sites. The central object is the exchange of prototypes rather than individual data, which carries the privacy claim while still allowing the model to learn from fragmented, unlabeled information.

Core claim

FedTGNN-SS trains a topology-adaptive graph neural network in a federated setting where each hospital maintains a local k-nearest neighbor graph of patients. Unlabeled records receive pseudo-labels guided by class prototypes shared from other sites together with an agreement check from neighboring patients in the graph. The graph is refined periodically using the learned embeddings and consistency is enforced on continuous clinical features through targeted augmentation. Only class centroids are exchanged to maintain privacy.
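
The pseudo-labeling step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes Euclidean distance, a nearest-prototype rule for the candidate label, and a simple vote fraction (`min_agree`, a hypothetical parameter) for the neighborhood agreement check. The function names and toy data are invented for this example.

```python
import math

def euclid(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_graph(points, k):
    """For each point, list the indices of its k nearest other points."""
    graph = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: euclid(p, points[j]))
        graph.append(others[:k])
    return graph

def pseudo_label(embeddings, labels, prototypes, k=2, min_agree=0.5):
    """Assign pseudo-labels to unlabeled nodes (label None).

    prototypes: {class_id: centroid}, assumed to arrive from the server.
    A candidate label is kept only if at least `min_agree` of the node's
    k nearest neighbors carry (or are candidates for) the same class.
    """
    graph = knn_graph(embeddings, k)
    # Step 1: candidate label = class of the nearest prototype.
    candidate = {}
    for i, lab in enumerate(labels):
        if lab is None:
            candidate[i] = min(
                prototypes, key=lambda c: euclid(embeddings[i], prototypes[c])
            )
    # Step 2: keep the candidate only when the neighborhood agrees.
    accepted = {}
    for i, cand in candidate.items():
        votes = [labels[j] if labels[j] is not None else candidate[j]
                 for j in graph[i]]
        if sum(v == cand for v in votes) / len(votes) >= min_agree:
            accepted[i] = cand
    return accepted

# Toy data: node 6 sits closer to the class-1 prototype, but its two
# nearest neighbors are labeled 0, so its candidate label is rejected.
embeddings = [(0.0, 0.0), (0.0, 0.1), (0.1, 0.0),
              (1.0, 1.0), (1.0, 0.9), (0.9, 1.0),
              (0.6, 0.6), (0.62, 0.6), (0.6, 0.62)]
labels = [0, 0, None, 1, 1, None, None, 0, 0]
prototypes = {0: (0.0, 0.0), 1: (1.0, 1.0)}
accepted = pseudo_label(embeddings, labels, prototypes)
```

In this toy run, nodes 2 and 5 receive pseudo-labels while node 6 is left unlabeled, which is the kind of filtering the agreement check is meant to provide.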

What carries the argument

Prototype-guided pseudo-labeling with neighborhood agreement inside a federated topology-adaptive graph neural network that also performs adaptive graph refinement and clinical-aware consistency augmentation.

If this is right

  • The model can use unlabeled patient records across hospitals to improve gestational diabetes prediction without centralizing data.
  • Local patient similarity graphs can be updated iteratively with learned embeddings to capture better connections.
  • Privacy holds because no individual records or features leave each hospital.
  • Consistency regularization applies selectively to continuous variables while leaving categorical clinical fields untouched.
  • The framework supports training under varying degrees of label scarcity at each participating site.
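
The selective consistency idea in the list above can be illustrated as follows. This is a sketch under stated assumptions: features are taken to be standardized, the perturbation is Gaussian jitter, and the consistency term is a mean squared difference between two predicted probability vectors. The paper summary does not specify any of these choices, and the column names are hypothetical.

```python
import random

def augment_continuous(record, continuous_cols, noise_scale=0.05, rng=None):
    """Return a perturbed copy of `record`.

    Only the columns named in `continuous_cols` receive Gaussian jitter;
    every other field (e.g. binary or categorical flags) is copied as-is.
    """
    if rng is None:
        rng = random.Random()
    out = dict(record)
    for col in continuous_cols:
        out[col] = record[col] + rng.gauss(0.0, noise_scale)
    return out

def consistency_loss(probs_a, probs_b):
    """Mean squared difference between two predicted probability vectors."""
    return sum((a - b) ** 2 for a, b in zip(probs_a, probs_b)) / len(probs_a)

# Hypothetical standardized record: two continuous fields, one binary flag.
rng = random.Random(0)
record = {"glucose_z": 1.2, "bmi_z": -0.4, "family_history": 1}
augmented = augment_continuous(record, ["glucose_z", "bmi_z"], rng=rng)
```

During training, a model would score both `record` and `augmented`, and `consistency_loss` would penalize disagreement between the two predictions.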

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same prototype-sharing idea could extend to other medical prediction tasks where records are siloed and labels are sparse.
  • Adding differential privacy noise to the shared centroids would test whether stronger leakage protection is possible without harming accuracy.
  • Real multi-hospital deployments would reveal whether the method copes with the non-identical data distributions typical across institutions.
  • Incorporating time-stamped visit data into the patient graphs could further strengthen risk models for gestational diabetes.
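
The differential-privacy extension suggested above could look roughly like the sketch below. It is not part of the paper. It assumes the classic Gaussian mechanism calibration (valid for epsilon below 1), L2 clipping of each embedding to `max_norm`, and a replace-one-record notion of neighboring datasets, which gives the clipped mean a sensitivity of `2 * max_norm / n`. It covers a single release only; privacy loss accumulated over many federation rounds is not accounted for.

```python
import math
import random

def clip(vec, max_norm):
    """Scale `vec` down so its L2 norm is at most `max_norm`."""
    norm = math.sqrt(sum(x * x for x in vec))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in vec]

def noisy_centroid(embeddings, max_norm, epsilon, delta, rng):
    """Return (noisy class centroid, noise standard deviation)."""
    n = len(embeddings)
    clipped = [clip(e, max_norm) for e in embeddings]
    dim = len(clipped[0])
    mean = [sum(e[d] for e in clipped) / n for d in range(dim)]
    # Replacing one clipped record moves the mean by at most 2 * max_norm / n.
    sensitivity = 2.0 * max_norm / n
    sigma = sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon
    return [m + rng.gauss(0.0, sigma) for m in mean], sigma

rng = random.Random(0)
class_embeddings = [[3.0, 4.0], [0.1, 0.0], [0.0, 0.2], [1.0, 1.0]]
centroid, sigma = noisy_centroid(class_embeddings, max_norm=1.0,
                                 epsilon=0.5, delta=1e-5, rng=rng)
```

With only four records the noise dominates the centroid, which illustrates the accuracy cost that such an experiment would need to measure on small sites or rare classes.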

Load-bearing premise

Two assumptions must both hold: prototype-guided pseudo-labeling combined with neighborhood agreement produces sufficiently accurate labels for unlabeled records, and sharing only class centroids preserves privacy without meaningful leakage or bias.

What would settle it

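Replacing the prototype-guided pseudo-labeling step with random labels on the same diabetes datasets and checking whether the federated model loses its performance advantage over standard baselines. If the advantage survives random labels, the pseudo-labeling module is not what drives the reported gains.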

Figures

Figures reproduced from arXiv: 2605.01810 by A. Mallikarjuna Reddy, G. Victor Daniel, Sravanth Kumar Ramakuri, Sridhar Reddy Gogu, Uday Kumar Addanki.

Figure 1: Overview of the proposed FedTGNN-SS framework. Each hospital trains a local GNN on its patient similarity graph with (i) prototype-guided pseudo-labeling, (ii) adaptive graph refinement, and (iii) clinical-aware augmentation; only class-level prototypes are shared with the server for privacy-safe aggregation across rounds.

Abstract

Gestational Diabetes Mellitus (GDM) is a high-prevalence pregnancy complication that requires accurate early risk stratification to reduce maternal and fetal morbidity. However, real-world clinical deployment of machine learning is hindered by two coupled constraints: (i) label scarcity, where a large fraction of electronic health records (EHR) lack confirmed diagnostic labels, and (ii) data privacy, which prevents sharing patient-level data across hospitals. This paper proposes FedTGNN-SS, a privacy-preserving federated semi-supervised framework for clinical tabular EHR. Each hospital builds a local k-nearest-neighbor patient similarity graph and trains a topology-adaptive GNN encoder. To robustly exploit unlabeled records, FedTGNN-SS combines (1) prototype-guided pseudo-labeling with neighborhood agreement, (2) adaptive graph refinement that periodically updates the k-NN graph using learned embeddings, (3) clinical-aware consistency augmentation applied only to continuous variables, and (4) privacy-safe prototype sharing that exchanges only class-level centroids. Across three diabetes-related datasets (GDM: N = 3,525; Pima: N = 768; Early Stage: N = 520) under 10%-80% missing labels per silo, FedTGNN-SS achieves 56 significant wins (p < 0.05) against 11 federated baselines and attains strong AUROC under extreme scarcity (Pima: 0.8037 at 80% missing, Early Stage: 0.9634 at 80% missing).
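
The prototype-sharing component (4) in the abstract can be pictured with a small sketch. The abstract does not say how the server combines centroids, so the count-weighted average used here is an assumption, as are the function names and the decision to send per-class counts alongside each centroid (which is itself a small additional disclosure).

```python
def local_prototypes(embeddings, labels):
    """Compute {class_id: (centroid, count)} from one site's labeled records.

    Records with label None are skipped; no per-patient vector leaves
    this function, only per-class means and counts.
    """
    sums, counts = {}, {}
    for emb, lab in zip(embeddings, labels):
        if lab is None:
            continue
        acc = sums.setdefault(lab, [0.0] * len(emb))
        for d, x in enumerate(emb):
            acc[d] += x
        counts[lab] = counts.get(lab, 0) + 1
    return {c: ([s / counts[c] for s in sums[c]], counts[c]) for c in sums}

def aggregate_prototypes(site_payloads):
    """Server side: count-weighted average of each class centroid."""
    totals, weights = {}, {}
    for payload in site_payloads:
        for c, (centroid, n) in payload.items():
            acc = totals.setdefault(c, [0.0] * len(centroid))
            for d, x in enumerate(centroid):
                acc[d] += n * x
            weights[c] = weights.get(c, 0) + n
    return {c: [t / weights[c] for t in totals[c]] for c in totals}

# Two hypothetical hospitals with 2-D embeddings.
site_a = local_prototypes([(0.0, 0.0), (2.0, 0.0), (10.0, 10.0)], [0, 0, 1])
site_b = local_prototypes([(4.0, 0.0), (12.0, 10.0), (5.0, 5.0)], [0, 1, None])
global_prototypes = aggregate_prototypes([site_a, site_b])
```

Here the global class-0 prototype weights site A twice as heavily as site B because site A contributed two labeled records to that class.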

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The manuscript proposes FedTGNN-SS, a federated semi-supervised graph neural network framework for privacy-preserving GDM prediction from tabular EHR data. Each silo constructs a local k-NN patient similarity graph and trains a topology-adaptive GNN; unlabeled records are handled via prototype-guided pseudo-labeling combined with neighborhood agreement, periodic adaptive graph refinement using learned embeddings, clinical-aware consistency augmentation on continuous features, and privacy-safe sharing of only class centroids. On three diabetes datasets (GDM N=3525, Pima N=768, Early Stage N=520) with 10-80% missing labels per silo, the method reports 56 statistically significant wins (p<0.05) over 11 federated baselines and strong AUROC under extreme scarcity (Pima 0.8037, Early Stage 0.9634 at 80% missing).

Significance. If the empirical gains and privacy properties hold after verification, the work would meaningfully advance federated semi-supervised learning for clinical tabular data by demonstrating effective exploitation of unlabeled records across silos without patient-level sharing, with direct relevance to high-stakes settings like maternal health where label scarcity and privacy constraints coexist.

major comments (3)
  1. [Abstract] Abstract: The headline AUROC values at 80% missing labels (Pima: 0.8037; Early Stage: 0.9634) and the 56 significant wins rest on the unverified assumption that prototype-guided pseudo-labeling plus neighborhood agreement yields sufficiently accurate labels; no pseudo-label precision/recall, error-rate analysis, or ablation isolating this module is supplied.
  2. [Abstract] Abstract and experimental claims: The superiority over 11 baselines under varying missing-label regimes requires concrete baseline specifications, hyperparameter protocols, and details of the statistical testing procedure that produced p<0.05; these are absent from the reported summary, preventing assessment of whether the wins reflect genuine generalization.
  3. [Privacy mechanism] Privacy mechanism description: The claim that exchanging only class centroids preserves privacy without meaningful leakage or bias is load-bearing for the federated setting, yet no membership-inference, reconstruction, or attribute-inference attack results are provided to support it.
minor comments (1)
  1. [Abstract] The abstract would benefit from a concise statement of the GNN encoder architecture (e.g., number of layers, aggregation function) used for the topology-adaptive component.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback, which helps clarify key aspects of our work. We respond point-by-point to the major comments below, indicating revisions where the manuscript will be updated to strengthen the presentation of results and claims.

  1. Referee: [Abstract] Abstract: The headline AUROC values at 80% missing labels (Pima: 0.8037; Early Stage: 0.9634) and the 56 significant wins rest on the unverified assumption that prototype-guided pseudo-labeling plus neighborhood agreement yields sufficiently accurate labels; no pseudo-label precision/recall, error-rate analysis, or ablation isolating this module is supplied.

    Authors: We appreciate this observation on the abstract's brevity. The full manuscript provides ablation studies in Section 5.2 that isolate the prototype-guided pseudo-labeling combined with neighborhood agreement, demonstrating its impact on performance especially under high label scarcity. Pseudo-label accuracy and error rates are reported in the supplementary material and Table 4. To address the concern directly, we will revise the abstract to reference these supporting analyses and add a concise summary of pseudo-label quality metrics in the main experimental section. revision: yes

  2. Referee: [Abstract] Abstract and experimental claims: The superiority over 11 baselines under varying missing-label regimes requires concrete baseline specifications, hyperparameter protocols, and details of the statistical testing procedure that produced p<0.05; these are absent from the reported summary, preventing assessment of whether the wins reflect genuine generalization.

    Authors: We agree that the abstract summary is high-level. Section 4.1 of the manuscript details the 11 federated baselines and their adaptations, Appendix B specifies the hyperparameter search protocols and selection criteria, and Section 4.4 describes the paired t-test procedure with Bonferroni correction used for the p<0.05 significance. We will expand the abstract to briefly note the statistical testing approach and ensure explicit cross-references to these sections are added for improved transparency. revision: yes

  3. Referee: [Privacy mechanism] Privacy mechanism description: The claim that exchanging only class centroids preserves privacy without meaningful leakage or bias is load-bearing for the federated setting, yet no membership-inference, reconstruction, or attribute-inference attack results are provided to support it.

    Authors: We acknowledge the value of empirical privacy validation for this claim. The mechanism exchanges only locally computed class centroids, which are non-invertible aggregates without patient-level data, consistent with established federated prototype methods. The manuscript does not include specific membership-inference, reconstruction, or attribute-inference attack experiments. We will add a dedicated privacy analysis subsection discussing theoretical guarantees for tabular EHR data and the limited leakage risk from centroids, while noting that comprehensive attack evaluations remain an avenue for future work. revision: partial
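
The testing procedure named in the second response (paired t-test with Bonferroni correction) can be sketched as follows. This is an illustration of the general technique, not the authors' script. It assumes paired scores come from matched runs of two methods and that p-values are computed elsewhere: converting the t statistic to a p-value requires a Student t distribution function, which is left out here to keep the sketch dependency-free.

```python
import math
import statistics

def paired_t_statistic(scores_a, scores_b):
    """t statistic for paired samples (degrees of freedom = n - 1)."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / std_err

def bonferroni_significant(p_values, alpha=0.05):
    """Flag each comparison as significant at the corrected level alpha / m."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]

# Hypothetical paired scores from four matched runs.
t_stat = paired_t_statistic([3.0, 5.0, 7.0, 9.0], [1.0, 2.0, 3.0, 4.0])

# Hypothetical p-values from three comparisons against baselines.
flags = bonferroni_significant([0.001, 0.02, 0.04])
```

With three comparisons the corrected threshold is 0.05 / 3, so only the first hypothetical p-value counts as a win. This shows why the number of comparisons used in the correction matters when reading a tally such as "56 significant wins".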

Circularity Check

0 steps flagged

No significant circularity in empirical federated semi-supervised pipeline

The paper proposes an empirical framework (FedTGNN-SS) that combines prototype-guided pseudo-labeling, adaptive k-NN graph refinement from learned embeddings, consistency augmentation, and centroid-only sharing, then reports experimental AUROC and win counts on three fixed datasets under controlled label-missing regimes. No derivation, theorem, or first-principles prediction is offered; all performance numbers arise from training and evaluation on held-out data rather than from any quantity being redefined in terms of itself. The iterative graph update is a standard training loop (embeddings inform graph, graph informs embeddings) that does not presuppose the final metric or force results by construction. Privacy and pseudo-label accuracy are treated as design assumptions whose consequences are measured experimentally, not as tautological identities. Consequently the reported 56 wins and high-AUROC figures under 80% missing labels remain data-driven outcomes, not reductions to the method's own inputs.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The framework rests on several unverified assumptions about data similarity and pseudo-label quality that are typical for semi-supervised graph methods but not independently validated in the abstract.

free parameters (2)
  • k for local k-NN graphs
    Choice of neighborhood size for patient similarity graphs; value not stated and likely tuned per dataset.
  • update frequency for adaptive graph refinement
    Periodicity of graph updates using learned embeddings; not specified.
axioms (2)
  • domain assumption Tabular EHR can be meaningfully represented as k-NN patient similarity graphs without critical information loss
    Invoked when each hospital builds local graphs from clinical variables.
  • ad hoc to paper Neighborhood agreement plus prototype distance yields reliable pseudo-labels for unlabeled records
    Core mechanism for exploiting missing labels; no error analysis provided.
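
The two free parameters in the ledger, the neighborhood size and the graph update frequency, appear naturally in a sketch of the refinement loop. This is a schematic under assumptions: `encode` is a stand-in callable for one training epoch of the GNN encoder, and `k` and `refresh_every` are hypothetical names for the unstated values.

```python
import math

def knn_edges(vectors, k):
    """For each vector, list the indices of its k nearest other vectors."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    graph = []
    for i, v in enumerate(vectors):
        others = [j for j in range(len(vectors)) if j != i]
        others.sort(key=lambda j: dist(v, vectors[j]))
        graph.append(others[:k])
    return graph

def train_with_refinement(features, encode, k, refresh_every, num_epochs):
    """Run training epochs, rebuilding the k-NN graph every few epochs.

    Returns the final graph and the epochs at which it was (re)built;
    epoch 0 denotes the initial graph built from raw features.
    """
    graph = knn_edges(features, k)
    rebuilt_at = [0]
    for epoch in range(1, num_epochs + 1):
        embeddings = encode(features, graph, epoch)  # placeholder for a GNN epoch
        if epoch % refresh_every == 0:
            graph = knn_edges(embeddings, k)  # refine using learned embeddings
            rebuilt_at.append(epoch)
    return graph, rebuilt_at

# Toy encoder that keeps only the first feature, standing in for a model
# that has learned the second feature is uninformative.
def toy_encode(features, graph, epoch):
    return [(f[0],) for f in features]

features = [(0.0, 0.0), (1.0, 5.0), (0.1, 9.0), (1.1, 0.0)]
initial_graph = knn_edges(features, 1)
final_graph, rebuilt_at = train_with_refinement(
    features, toy_encode, k=1, refresh_every=2, num_epochs=4
)
```

In the toy run, patient 0 is initially linked to patient 3 and ends up linked to patient 2 once the graph is rebuilt from embeddings. Both `k` and `refresh_every` change which edges exist, so results may be sensitive to how they were tuned.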
