Representation learning to advance multi-institutional studies with electronic health record data from US and France
Pith reviewed 2026-05-23 03:15 UTC · model grok-4.3
The pith
A graph-based framework aligns electronic health record vocabularies across institutions by learning a shared semantic space from local statistics, knowledge graphs, and language models.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The framework learns a shared semantic space by integrating institution-specific summary statistics from health records, curated biomedical knowledge graphs, and semantic information derived from large language models, thereby aligning diverse site-specific vocabularies while preserving patient privacy.
What carries the argument
A graph-based representation learning framework that jointly embeds institution-specific data summaries, biomedical knowledge graphs, and large language model-derived semantics into a unified space for vocabulary alignment.
If this is right
- Clinical models can be trained at one institution and deployed at others with aligned data representations.
- The method supports multi-institutional studies across different countries and languages.
- Privacy is maintained since only summary statistics are used, not individual patient records.
- Scalable harmonization is achieved without relying on fixed standards or manual mappings.
Where Pith is reading between the lines
- This approach might extend to other data types like imaging or genomic records if similar summary statistics and knowledge resources are available.
- Institutions could use the shared space to identify and correct inconsistencies in their own coding practices.
- Future work could test whether the alignment improves performance in specific clinical prediction tasks like disease diagnosis.
Load-bearing premise
That institution-specific summary statistics, curated biomedical knowledge graphs, and semantic information derived from large language models can be jointly learned into a shared semantic space that aligns diverse site-specific vocabularies.
What would settle it
Demonstrating that models using the learned alignments perform no better than those using random mappings or no alignment on cross-institution tasks would falsify the central claim.
Figures
read the original abstract
The widespread adoption of electronic health records has created new opportunities for translational clinical research, yet this promise remains constrained by fragmented data across privacy-siloed institutions and substantial heterogeneity in local coding practices. While privacy-preserving collaborative learning allows institutions to work together without sharing patient-level data, it does not address inconsistencies in how clinical concepts are represented across sites. We introduce a graph-based framework that addresses this gap by treating data harmonization as a scalable representation learning problem. Rather than relying on fixed standards or manual mappings, the framework integrates institution-specific summary statistics from health records, curated biomedical knowledge graphs, and semantic information derived from large language models to learn a shared semantic space. This joint learning approach aligns diverse, site-specific vocabularies while preserving patient privacy. Evaluated across seven institutions and two languages, the framework provides a robust, data-centric foundation for training and deploying clinical models across heterogeneous healthcare systems.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces a graph-based framework for data harmonization of electronic health records across privacy-siloed institutions. It treats harmonization as a representation learning task that jointly incorporates institution-specific summary statistics, curated biomedical knowledge graphs, and LLM-derived semantic information to produce a shared semantic space aligning site-specific vocabularies, while preserving patient privacy. The work claims evaluation across seven institutions and two languages as providing a robust foundation for multi-institutional clinical models.
Significance. If the central mechanism can be shown to work, the approach would address a genuine barrier in collaborative EHR research by moving beyond fixed standards or manual mappings. The data-centric framing and use of multiple heterogeneous inputs (summary stats + KGs + LLM semantics) are conceptually aligned with current needs in federated clinical modeling.
major comments (2)
- [Abstract] Abstract: the claim that the framework 'was evaluated across seven institutions' is unsupported because the abstract (and the supplied manuscript excerpt) contains no quantitative results, baselines, error metrics, ablation studies, or performance tables.
- [Abstract] Abstract: the core modeling claim—that institution-specific summary statistics, biomedical KGs, and LLM semantics can be jointly learned into an aligning shared space—lacks any description of the loss function, architecture, alignment objective (contrastive, reconstruction, graph alignment, etc.), or training procedure, rendering the joint-learning premise an unverified assumption rather than a demonstrated mechanism.
minor comments (1)
- [Abstract] The abstract could be revised to separate the high-level motivation from the specific technical contributions and to include at least one key quantitative result if the full manuscript contains it.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on the abstract. We agree that the abstract requires strengthening to better substantiate its claims and will revise it accordingly while preserving its brevity. We address each major comment below.
read point-by-point responses
-
Referee: [Abstract] Abstract: the claim that the framework 'was evaluated across seven institutions' is unsupported because the abstract (and the supplied manuscript excerpt) contains no quantitative results, baselines, error metrics, ablation studies, or performance tables.
Authors: We acknowledge that the current abstract states the evaluation scope without accompanying metrics. The full manuscript reports quantitative results including alignment F1 scores, cosine similarity improvements over baselines (e.g., direct KG matching and LLM-only embeddings), and ablation studies removing each input modality, all computed across the seven institutions. In revision we will insert a concise sentence summarizing key performance metrics and the evaluation scope to make the claim self-contained within the abstract. revision: yes
-
Referee: [Abstract] Abstract: the core modeling claim—that institution-specific summary statistics, biomedical KGs, and LLM semantics can be jointly learned into an aligning shared space—lacks any description of the loss function, architecture, alignment objective (contrastive, reconstruction, graph alignment, etc.), or training procedure, rendering the joint-learning premise an unverified assumption rather than a demonstrated mechanism.
Authors: The manuscript body specifies a graph neural network architecture with a composite objective: reconstruction loss on site-specific summary statistics, graph alignment loss on the biomedical KG edges, and contrastive loss aligning LLM-derived embeddings to the shared space, optimized via federated averaging. To address the abstract-level concern we will add a single clause describing the joint objective and alignment mechanism. revision: yes
Circularity Check
No significant circularity; framework relies on external inputs
full rationale
The paper introduces a graph-based representation learning framework that integrates institution-specific summary statistics, curated biomedical knowledge graphs, and LLM-derived semantics to produce a shared semantic space for vocabulary alignment. No equations, loss functions, or derivation steps are shown that reduce any claimed prediction or alignment result to a fitted parameter or input by construction. The approach depends on external resources (KGs, LLMs, site aggregates) rather than self-defining its outputs, and the provided text invokes no self-citations or uniqueness theorems as load-bearing justification. The central claim of cross-institution robustness is presented as an empirical outcome of the joint learning process, not a tautological renaming or definitional equivalence.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
Carla Abdelnour, Federica Agosta, Marco Bozzali, Bertrand Fougère, Atsushi Iwata, Ramin Nil- forooshan, Leonel T Takada, Félix Viñuela, and Martin Traber. Perspectives and challenges in patient stratification in alzheimer’s disease.Alzheimer’s research & therapy, 14(1):112, 2022
work page 2022
-
[2]
Melissa J Armstrong, Shangchen Song, Andrea M Kurasz, and Zhigang Li. Predictors of mortality 7https://docs.smarthealthit.org/ 19 in individuals with dementia in the national alzheimer’s coordinating center.Journal of Alzheimer’s Disease, 86(4):1935–1946, 2022
work page 1935
- [3]
-
[4]
Tian Bai, Ashis Kumar Chanda, Brian L Egleston, and Slobodan Vucetic. Ehr phenotyping via jointly embedding medical concepts and words into a unified vector space.BMC medical informatics and decision making, 18:15–25, 2018
work page 2018
-
[5]
Tucker: Tensor factorization for knowledge graph completion
Ivana Balažević, Carl Allen, and Timothy M Hospedales. Tucker: Tensor factorization for knowledge graph completion. arXiv preprint arXiv:1901.09590, 2019
-
[6]
Andrew L. Beam, Benjamin Kompa, Allen Schmaltz, Inbar Fried, Griffin Weber, Nathan Palmer, Xu Shi, Tianxi Cai, and Isaac S. Kohane. Clinical concept embeddings learned from massive sources of multimodal medical data. In Biocomputing 2020. WORLD SCIENTIFIC, Nov 2019. doi: 10. 1142/9789811215636_0027. URL https://doi.org/10.1142%2F9789811215636_0027
work page 2020
-
[7]
A neural probabilistic language model
Yoshua Bengio, Réjean Ducharme, and Pascal Vincent. A neural probabilistic language model. JMLR, 3:1137–1155, 2003
work page 2003
-
[8]
O. Bodenreider. The unified medical language system (umls): integrating biomedical terminology. Nucleic Acids Research, page 267D – 270, Jan 2004. doi: 10.1093/nar/gkh061. URLhttp://dx. doi.org/10.1093/nar/gkh061
-
[9]
Cédric Bousquet, Béatrice Trombert, Julien Souvignet, Eric Sadou, and Jean-Marie Rodrigues. Evaluation of the ccam hierarchy and semi structured code for retrieving relevant procedures in a hospital case mix database. In AMIA Annual Symposium Proceedings, volume 2010, page 61. American Medical Informatics Association, 2010
work page 2010
-
[10]
International statistical classification of diseases and related health problems
Gerlind R Brämer. International statistical classification of diseases and related health problems. tenth revision.World health statistics quarterly. Rapport trimestriel de statistiques sanitaires mon- diales, 41(1):32–36, 1988
work page 1988
-
[11]
Tianxi Cai, Dong Xia, Luwan Zhang, and Doudou Zhou. Consensus knowledge graph learning via multi-view sparse low rank block model.arXiv preprint arXiv:2209.13762, 2022
-
[12]
Jianlv Chen, Shitao Xiao, Peitian Zhang, Kun Luo, Defu Lian, and Zheng Liu. Bge m3-embedding: Multi-lingual, multi-functionality, multi-granularity text embeddings through self-knowledge distil- lation. arXiv preprint arXiv:2402.03216, 2024
work page internal anchor Pith review Pith/arXiv arXiv 2024
-
[13]
Min Chen, Yongfeng Qian, Jing Chen, Kai Hwang, Shiwen Mao, and Long Hu. Privacy protec- tion and intrusion avoidance for cloudlet-based medical data sharing.IEEE transactions on Cloud computing, 8(4):1274–1283, 2016
work page 2016
-
[14]
Multi-layer representation learning for medi- cal concepts
Edward Choi, Mohammad Taha Bahadori, Elizabeth Searles, Catherine Coffey, Michael Thompson, James Bost, Javier Tejedor-Sojo, and Jimeng Sun. Multi-layer representation learning for medi- cal concepts. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1495–1504, 2016. 20
work page 2016
-
[15]
Andrea Cipriani, Corrado Barbui, Georgia Salanti, Jennifer Rendell, Rachel Brown, Sarah Stockton, Marianna Purgato, Loukia M Spineli, Guy M Goodwin, and John R Geddes. Comparative efficacy and acceptability of antimanic drugs in acute mania: a multiple-treatments meta-analysis. The Lancet, 378(9799):1306–1315, 2011
work page 2011
-
[16]
Finale Doshi-Velez, Yaorong Ge, and Isaac Kohane. Comorbidity clusters in autism spectrum dis- orders: an electronic health record time-series analysis.Pediatrics, 133(1):e54–e63, 2014
work page 2014
-
[17]
Louis Favril, Rongqin Yu, Abdo Uyar, Michael Sharpe, and Seena Fazel. Risk factors for suicide in adults: systematic review and meta-analysis of psychological autopsy studies.BMJ Ment Health, 25(4):148–155, 2022
work page 2022
-
[18]
Seena Fazel and Bo Runeson. Suicide. New England Journal of Medicine, 382(3):266–274, 2020. doi: 10.1056/NEJMra1902944
-
[19]
Gnaeus: Utilizing clinical guidelines for knowledge-assisted visualisation of ehr cohorts
Paolo Federico, Jürgen Unger, Albert Amor-Amorós, Lucia Sacchi, Denis Klimov, and Silvia Miksch. Gnaeus: Utilizing clinical guidelines for knowledge-assisted visualisation of ehr cohorts. InEuroVA@ EuroVis, pages 79–83, 2015
work page 2015
-
[20]
Thomas Ferté, Vianney Jouhet, Romain Griffier, Boris P Hejblum, and Rodolphe Thiébaut. The benefit of augmenting open data with clinical data-warehouse ehr for forecasting sars-cov-2 hospi- talizations in bordeaux area, france.JAMIA open, 5(4):ooac086, 2022
work page 2022
-
[21]
Ziming Gan, Doudou Zhou, Everett Rush, Vidul A Panickan, Yuk-Lam Ho, George Ostrouchov, Zhiwei Xu, Shuting Shen, Xin Xiong, Kimberly F Greco, et al. ARCH: Large-scale knowledge graph via aggregated narrative codified health records analysis.medRxiv, 2023
work page 2023
-
[22]
A new model for learning in graph domains
Marco Gori, Gabriele Monfardini, and Franco Scarselli. A new model for learning in graph domains. InProceedings. 2005 IEEE international joint conference on neural networks, 2005., volume 2, pages 729–734. IEEE, 2005
work page 2005
-
[23]
Yu Gu, Robert Tinn, Hao Cheng, Michael Lucas, Naoto Usuyama, Xiaodong Liu, Tristan Naumann, Jianfeng Gao, and Hoifung Poon. Domain-specific language model pretraining for biomedical natural language processing.ACM Transactions on Computing for Healthcare (HEALTH), 3(1):1–23, 2021
work page 2021
-
[24]
Lukas Heumos, Philipp Ehmele, Tim Treis, Julius Upmeier zu Belzen, Eljas Roellin, Lilly May, Altana Namsaraeva, Nastassya Horlava, Vladimir A Shitov, Xinyue Zhang, et al. An open-source framework for end-to-end analysis of electronic health record data.Nature medicine, 30(11):3369– 3380, 2024
work page 2024
-
[25]
Chuan Hong, Everett Rush, Molei Liu, Doudou Zhou, Jiehuan Sun, Aaron Sonabend, Victor M Castro, Petra Schubert, Vidul A Panickan, Tianrun Cai, et al. Clinical knowledge extraction via sparse embedding regression (keser) with multi-center large scale electronic health record data. medRxiv, 2021
work page 2021
-
[26]
Psychosis in alzheimer disease—mechanisms, genetics and therapeutic opportunities
Zahinoor Ismail, Byron Creese, Dag Aarsland, Helen C Kales, Constantine G Lyketsos, Robert A Sweet, and Clive Ballard. Psychosis in alzheimer disease—mechanisms, genetics and therapeutic opportunities. Nature Reviews Neurology, 18(3):131–144, 2022
work page 2022
-
[27]
A Johnson, L Bulgarelli, T Pollard, S Horng, L A Celi, and R Mark. MIMIC-IV (version 0.4). PhysioNet., 2020. 21
work page 2020
-
[28]
Code2vec: Embed- ding and clustering medical diagnosis data
David Kartchner, Tanner Christensen, Jeffrey Humpherys, and Sean Wade. Code2vec: Embed- ding and clustering medical diagnosis data. In2017 IEEE International Conference on Healthcare Informatics, pages 386–390, 2017
work page 2017
-
[29]
Isotta Landi, Benjamin S Glicksberg, Hao-Chih Lee, Sarah Cherng, Giulia Landi, Matteo Danieletto, Joel T Dudley, Cesare Furlanello, and Riccardo Miotto. Deep representation learning of electronic health records to unlock patient stratification at scale.NPJ digital medicine, 3(1):96, 2020
work page 2020
-
[30]
Dongha Lee, Xiaoqian Jiang, and Hwanjo Yu. Harmonized representation learning on dynamic ehr graphs. Journal of biomedical informatics, 106:103426, June 2020. ISSN 1532-0464. doi: 10.1016/j. jbi.2020.103426. URL https://doi.org/10.1016/j.jbi.2020.103426
work page doi:10.1016/j 2020
-
[31]
Biobert: a pre-trained biomedical language representation model for biomedical text mining
Jinhyuk Lee, Wonjin Yoon, Sungdong Kim, Donghyeon Kim, Sunkyu Kim, Chan Ho So, and Jaewoo Kang. Biobert: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics, 36(4):1234–1240, 2020
work page 2020
-
[32]
Omer Levy and Yoav Goldberg. Neural word embedding as implicit matrix factorization.Advances in neural information processing systems, 27, 2014
work page 2014
-
[33]
Li Li, Wei-Yi Cheng, Benjamin S Glicksberg, Omri Gottesman, Ronald Tamler, Rong Chen, Erwin P Bottinger, and Joel T Dudley. Identification of type 2 diabetes subgroups through topological analysis of patient similarity.Science translational medicine, 7(311):311ra174–311ra174, 2015
work page 2015
-
[34]
Katherine P Liao, Tianxi Cai, Guergana K Savova, Shawn N Murphy, Elizabeth W Karlson, Ash- win N Ananthakrishnan, Vivian S Gainer, Stanley Y Shaw, Zongqi Xia, Peter Szolovits, et al. Development of phenotype algorithms using electronic medical records and incorporating natural language processing. bmj, 350, 2015
work page 2015
-
[35]
Multimodal learning on graphs for disease relation extraction
Yucong Lin, Keming Lu, Sheng Yu, Tianxi Cai, and Marinka Zitnik. Multimodal learning on graphs for disease relation extraction. CoRR, abs/2203.08893, 2022. doi: 10.48550/ARXIV.2203.08893. URL https://doi.org/10.48550/arXiv.2203.08893
-
[36]
Self-alignment pretraining for biomedical entity representations
Fangyu Liu, Ehsan Shareghi, Zaiqiao Meng, Marco Basaldella, and Nigel Collier. Self-alignment pretraining for biomedical entity representations. In Kristina Toutanova, Anna Rumshisky, Luke Zettlemoyer, Dilek Hakkani-Tur, Iz Beltagy, Steven Bethard, Ryan Cotterell, Tanmoy Chakraborty, and Yichao Zhou, editors,Proceedings of the 2021 Conference of the North...
-
[37]
URL https://aclanthology.org/2021.naacl-main.334
work page 2021
-
[38]
The role of nmda receptors in alzheimer’s disease.Frontiers in neuroscience, 13:43, 2019
Jinping Liu, Lirong Chang, Yizhi Song, Hui Li, and Yan Wu. The role of nmda receptors in alzheimer’s disease.Frontiers in neuroscience, 13:43, 2019
work page 2019
-
[39]
Clement J McDonald, Stanley M Huff, Jeffrey G Suico, Gilbert Hill, Dennis Leavelle, Raymond Aller, Arden Forrey, Kathy Mercer, Georges DeMoor, John Hook, et al. Loinc, a universal standard for identifying laboratory observations: a 5-year update.Clinical chemistry, 49(4):624–633, 2003
work page 2003
-
[40]
Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S Corrado, and Jeff Dean. Distributed representa- tions of words and phrases and their compositionality.Adv Neural Inf Process Syst, 26:3111–3119, 2013. 22
work page 2013
-
[41]
Soheila Molaei, Anshul Thakur, Ghazaleh Niknam, Andrew Soltan, Hadi Zare, and David A Clifton. Federated learning for heterogeneous electronic health records utilising augmented temporal graph attention networks. InInternational Conference on Artificial Intelligence and Statistics, pages 1342–
- [42]
-
[43]
International classification of diseases—ninth revision (icd-9)
World Health Organization et al. International classification of diseases—ninth revision (icd-9). Weekly Epidemiological Record= Relevé épidémiologique hebdomadaire, 63(45):343–344, 1988
work page 1988
-
[44]
Glove: Global vectors for word representation
Jeffrey Pennington, Richard Socher, and Christopher D Manning. Glove: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, pages 1532–1543, 2014
work page 2014
-
[45]
Micah Sheller, Brandon Edwards, G. Reina, Jason Martin, Sarthak Pati, Aikaterini Kotrotsou, Mikhail Milchenko, Weilin Xu, Daniel Marcus, Rivka Colen, and Spyridon Bakas. Federated learning in medicine: facilitating multi-institutional collaborations without sharing patient data.Scientific Reports, 10, 07 2020. doi: 10.1038/s41598-020-69250-1
-
[46]
Biomegatron: larger biomedical domain language model
Hoo-Chang Shin, Yang Zhang, Evelina Bakhturina, Raul Puri, Mostofa Patwary, Mohammad Shoeybi, and Raghav Mani. Biomegatron: larger biomedical domain language model. In Pro- ceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 4700–4706, 2020
work page 2020
-
[47]
Roshan Sutar, Akash Kumar, and Vikas Yadav. Suicide and prevalence of mental disorders: A systematic review and meta-analysis of world data on case-control psychological autopsy studies. Psychiatry research, page 115492, 2023
work page 2023
-
[48]
Federated k-means clustering.arXiv preprint arXiv:2310.01195, 2024
Marcel Reinders Swier Garst. Federated k-means clustering.arXiv preprint arXiv:2310.01195, 2024
-
[49]
Pierre N. Tariot, Martin R. Farlow, George T. Grossberg, Stephen M. Graham, Scott McDonald, Ivan Gergel, and for the Memantine Study Group. Memantine treatment in patients with moderate to severe alzheimer disease already receiving donepezila randomized controlled trial.JAMA, 291(3): 317–324, 01 2004. ISSN 0098-7484. doi: 10.1001/jama.291.3.317
-
[50]
Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, and Yoshua Bengio. Graph attention networks. InInternational Conference on Learning Representations, 2018. URL https://openreview.net/forum?id=rJXMpikCZ
work page 2018
-
[51]
Risk and Trust Perceptions of the Public of Artifical Intelligence Applications
Ke Wang, Ning Chen, and Ting Chen. Joint medical ontology representation learning for healthcare predictions. In2020 International Joint Conference on Neural Networks (IJCNN), pages 1–7, 2020. doi: 10.1109/IJCNN48605.2020.9207355
-
[52]
Linshanshan Wang, Shruthi Venkatesh, Michele Morris, Mengyan Li, Ratnam Srivastava, Shyam Visweswaran, Oscar Lopez, Zongqi Xia, and Tianxi Cai. Stratification of alzheimer’s disease patients using knowledge-guided unsupervised latent factor clustering with electronic health record data. medRxiv, 2024. doi: 10.1101/2024.12.23.24319588. URL https://www.medr...
-
[53]
Multi-similarity loss with general pair weighting for deep metric learning
Xun Wang, Xintong Han, Weilin Huang, Dengke Dong, and Matthew R Scott. Multi-similarity loss with general pair weighting for deep metric learning. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 5022–5030, 2019. 23
work page 2019
-
[54]
Knowledge graph embedding by trans- lating on hyperplanes
Zhen Wang, Jianwen Zhang, Jianlin Feng, and Zheng Chen. Knowledge graph embedding by trans- lating on hyperplanes. InProceedings of the AAAI Conference on Artificial Intelligence, volume 28, 2014
work page 2014
-
[55]
Mandl, Suchun Cheng, Zongqi Xia, Kelly Cho, J
Xin Xiong, Sara Morini Sweet, Molei Liu, Chuan Hong, Clara-Lea Bonzel, Vidul Ayakulangara Panickan, Doudou Zhou, Linshanshan Wang, Lauren Costa, Yuk-Lam Ho, Alon Geva, Kenneth D. Mandl, Suchun Cheng, Zongqi Xia, Kelly Cho, J. Michael Gaziano, Katherine P. Liao, Tianxi Cai, and Tianrun Cai. Knowledge-driven online multimodal automated phenotyping system.medRxiv,
-
[56]
URL https://www.medrxiv.org/content/early/2023/ 10/02/2023.09.29.23296239
doi: 10.1101/2023.09.29.23296239. URL https://www.medrxiv.org/content/early/2023/ 10/02/2023.09.29.23296239
-
[57]
Kg-bert: Bert for knowledge graph completion.arXiv preprint arXiv:1909.03193, 2019
Liang Yao, Chengsheng Mao, and Yuan Luo. Kg-bert: Bert for knowledge graph completion.arXiv preprint arXiv:1909.03193, 2019
-
[58]
Zheng Yuan, Zhengyun Zhao, Haixia Sun, Jiao Li, Fei Wang, and Sheng Yu. Coder: Knowledge- infused cross-lingual medical term embedding for term normalization.Journal of Biomedical Infor- matics, 126:103983, 2022
work page 2022
-
[59]
Xiaoting Zheng, Shichan Wang, Jingxuan Huang, Chunyu Li, and Huifang Shang. Predictors for survival in patients with alzheimer’s disease: a large comprehensive meta-analysis.Translational Psychiatry, 14(1):184, 2024
work page 2024
-
[60]
Panickan, Chuan Hong, Yuk-Lam Ho, Tianrun Cai, Lauren Costa, Xiaoou Li, Victor M
Doudou Zhou, Ziming Gan, Xu Shi, Alina Patwari, Everett Rush, Clara-Lea Bonzel, Vidul A. Panickan, Chuan Hong, Yuk-Lam Ho, Tianrun Cai, Lauren Costa, Xiaoou Li, Victor M. Castro, Shawn N. Murphy, Gabriel Brat, Griffin Weber, Paul Avillach, J. Michael Gaziano, Kelly Cho, Katherine P. Liao, Junwei Lu, and Tianxi Cai. Multiview incomplete knowledge graph int...
-
[61]
Doudou Zhou, Yufeng Zhang, Aaron Sonabend-W, Zhaoran Wang, Junwei Lu, and Tianxi Cai. Federated offline reinforcement learning. Journal of the American Statistical Association, pages 1–12, 2024. 24 Supplementary Material Representation Learning to Advance Multi-Institutional Studies with Electronic Health Record Data S.1 Training and validation data base ...
work page 2024
-
[62]
In the similarity training step, we save the embedding with the highest code mapping accuracy, as detailed in Algorithm 2. In the relatedness training step, we save the embedding with the highest feature selection correlation, also detailed in Algorithm 2. When splitting the training and validation sets, we divide the similar hierarchical pairs according ...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.