Snomed2Vec: Random Walk and Poincaré Embeddings of a Clinical Knowledge Base for Healthcare Analytics
Pith reviewed 2026-05-24 19:03 UTC · model grok-4.3
The pith
Embeddings from the SNOMED-CT knowledge graph outperform those from electronic health records on medical tasks.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Concept embeddings derived from the SNOMED-CT knowledge graph significantly outperform state-of-the-art embeddings, showing 5-6x improvement in concept similarity and 6-20% improvement in patient diagnosis.
What carries the argument
Random walk and Poincaré embeddings applied to the SNOMED-CT knowledge graph to produce vector representations of medical concepts.
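As a rough illustration of the random-walk half of this machinery, the sketch below generates uniform (DeepWalk-style) walks over a small graph; such walks are typically fed to a skip-gram model as if they were sentences. The function name `uniform_random_walks`, the toy graph, and the hyperparameter values are all assumptions made for illustration; the walk strategy and settings actually used in the paper are not stated on this page.

```python
import random


def uniform_random_walks(adjacency, num_walks, walk_length, seed=0):
    """Generate uniform random walks starting from every node.

    adjacency: dict mapping each node to a list of neighbor nodes.
    Returns a list of walks, each a list of node identifiers.
    """
    rng = random.Random(seed)
    nodes = sorted(adjacency)
    walks = []
    for _ in range(num_walks):
        rng.shuffle(nodes)  # visit start nodes in a fresh order each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency[walk[-1]]
                if not neighbors:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


# Hypothetical toy fragment of a concept hierarchy, not real SNOMED-CT content.
toy_graph = {
    "clinical_finding": ["disease"],
    "disease": ["clinical_finding", "diabetes", "pneumonia"],
    "diabetes": ["disease", "type_2_diabetes"],
    "type_2_diabetes": ["diabetes"],
    "pneumonia": ["disease"],
}

walks = uniform_random_walks(toy_graph, num_walks=2, walk_length=4)
```

Each walk is a short sequence of concept identifiers in which neighboring positions are connected in the graph, so concepts that are close in the hierarchy tend to co-occur in the same walks.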
If this is right
- Improved accuracy on node classification tasks within the medical concept graph.
- Stronger results on link prediction for relationships between medical concepts.
- Better performance in predicting patient states using the learned embeddings.
- More effective representations for downstream healthcare analytics models.
Where Pith is reading between the lines
- Knowledge graphs like SNOMED-CT may capture structured medical relationships that raw patient data alone does not provide as effectively.
- This method could be adapted to other specialized domains with rich ontologies to improve embedding quality.
- Patient-level predictions might benefit from combining graph-derived embeddings with other data sources in future models.
Load-bearing premise
The structure and content of the SNOMED-CT knowledge graph supply a signal that is both richer and more generalizable to patient-level prediction tasks than embeddings learned directly from electronic health records.
What would settle it
Running the patient diagnosis prediction task on an independent dataset and finding that the SNOMED-CT embeddings do not achieve the reported 6-20% improvement over baselines.
Original abstract
Representation learning methods that transform encoded data (e.g., diagnosis and drug codes) into continuous vector spaces (i.e., vector embeddings) are critical for the application of deep learning in healthcare. Initial work in this area explored the use of variants of the word2vec algorithm to learn embeddings for medical concepts from electronic health records or medical claims datasets. We propose learning embeddings for medical concepts by using graph-based representation learning methods on SNOMED-CT, a widely popular knowledge graph in the healthcare domain with numerous operational and research applications. Current work presents an empirical analysis of various embedding methods, including the evaluation of their performance on multiple tasks of biomedical relevance (node classification, link prediction, and patient state prediction). Our results show that concept embeddings derived from the SNOMED-CT knowledge graph significantly outperform state-of-the-art embeddings, showing 5-6x improvement in "concept similarity" and 6-20% improvement in patient diagnosis.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes Snomed2Vec, which applies random-walk and Poincaré embedding methods to the SNOMED-CT knowledge graph to produce vector representations of medical concepts. It reports empirical results on node classification, link prediction, and patient-state prediction tasks, claiming that the resulting embeddings significantly outperform prior state-of-the-art embeddings learned from electronic health records or claims data via word2vec-style methods, with 5-6× gains on a concept-similarity metric and 6-20% gains on patient diagnosis.
Significance. If the reported gains survive controlled re-evaluation that removes the alignment between training graph and test metric, the work would supply a reusable clinical embedding resource and demonstrate the value of structured knowledge graphs over purely observational EHR data for downstream healthcare tasks. The patient-diagnosis result is the more consequential of the two claims.
major comments (2)
- [Experiments (node classification / link prediction subsection)] The headline 5-6× improvement on “concept similarity” is reported without an explicit definition of the metric or the train/test split used for link prediction / similarity evaluation. Because the random-walk and Poincaré models are trained precisely to recover SNOMED-CT is-a and attribute relations, any similarity or link-prediction score computed on the same graph necessarily favors the graph embeddings over EHR-derived baselines that never observe those relations. This alignment must be quantified (e.g., by reporting AUC on held-out SNOMED edges versus a non-graph baseline) before the claim of richer clinical semantics can be accepted.
- [Patient state prediction experiments] The 6-20% improvement on patient diagnosis is presented without details on how the SNOMED embeddings are injected into the downstream model, whether the patient cohort overlaps with the SNOMED concepts used for training, or what statistical tests and data splits were employed. These controls are required to establish that the gain is attributable to the graph signal rather than to differences in feature dimensionality, preprocessing, or cohort construction.
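The control requested in the first major comment (AUC on held-out SNOMED edges) can be sketched as follows. This is a minimal illustration that assumes cosine similarity as the edge score and that held-out edges and sampled non-edges are already available; the embeddings, edge lists, and the name `link_prediction_auc` are hypothetical and do not come from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def link_prediction_auc(embeddings, positive_edges, negative_edges):
    """AUC as the probability that a held-out edge outscores a non-edge.

    Ties contribute 0.5, which matches the rank-based definition of AUC.
    """
    pos = [cosine(embeddings[a], embeddings[b]) for a, b in positive_edges]
    neg = [cosine(embeddings[a], embeddings[b]) for a, b in negative_edges]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical 2-d embeddings and edges, for illustration only.
toy_embeddings = {
    "diabetes": [0.9, 0.1],
    "type_2_diabetes": [0.8, 0.2],
    "pneumonia": [0.1, 0.9],
    "lung_infection": [0.2, 0.8],
}
held_out_edges = [("diabetes", "type_2_diabetes"), ("pneumonia", "lung_infection")]
sampled_non_edges = [("diabetes", "pneumonia"), ("type_2_diabetes", "lung_infection")]

auc = link_prediction_auc(toy_embeddings, held_out_edges, sampled_non_edges)
```

Running the same function on graph-derived and EHR-derived embeddings, with edges that neither model saw during training, would give the like-for-like comparison the referee asks for.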
minor comments (2)
- [Abstract / Introduction] The abstract and introduction should cite the exact prior EHR embedding papers (e.g., Choi et al., 2016; Choi et al., 2017) that constitute the “state-of-the-art” baselines so that readers can verify implementation fidelity.
- [Methods] Notation for the Poincaré model (hyperbolic distance, curvature parameter) and the random-walk hyperparameters (walk length, number of walks) should be stated explicitly in the methods section.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. The concerns about missing experimental details are valid, and we will revise the manuscript to address them directly while preserving the core claims.
- Referee: [Experiments (node classification / link prediction subsection)] The headline 5-6× improvement on “concept similarity” is reported without an explicit definition of the metric or the train/test split used for link prediction / similarity evaluation. Because the random-walk and Poincaré models are trained precisely to recover SNOMED-CT is-a and attribute relations, any similarity or link-prediction score computed on the same graph necessarily favors the graph embeddings over EHR-derived baselines that never observe those relations. This alignment must be quantified (e.g., by reporting AUC on held-out SNOMED edges versus a non-graph baseline) before the claim of richer clinical semantics can be accepted.
  Authors: We agree the manuscript omits an explicit definition of the concept similarity metric and the precise train/test splits. The reported gains used a held-out subset of SNOMED-CT edges for link prediction and similarity ranking (with cosine similarity as the underlying measure), evaluated identically for all methods including the EHR baselines. To resolve the alignment concern we will add: (1) the exact metric definition and split ratios, (2) AUC on held-out SNOMED edges, and (3) results against a non-graph baseline (e.g., random or frequency-based) as suggested. These additions will appear in a new subsection of the experiments.
  Revision: yes
- Referee: [Patient state prediction experiments] The 6-20% improvement on patient diagnosis is presented without details on how the SNOMED embeddings are injected into the downstream model, whether the patient cohort overlaps with the SNOMED concepts used for training, or what statistical tests and data splits were employed. These controls are required to establish that the gain is attributable to the graph signal rather than to differences in feature dimensionality, preprocessing, or cohort construction.
  Authors: The manuscript is missing these implementation details. SNOMED embeddings were injected as fixed feature vectors into a downstream logistic regression or neural network classifier for diagnosis prediction; the patient cohort was drawn from an independent EHR source with no overlap in the training instances. We will expand the patient-state section to specify the injection procedure, confirm cohort separation, report the train/validation/test splits, and include statistical tests (paired t-test and McNemar’s test) with p-values. This will be added as a dedicated paragraph with a new table of controls.
  Revision: yes
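McNemar's test, named in the second response, compares two classifiers on the same test patients using only the cases where they disagree. The sketch below implements the exact (binomial) two-sided variant; which variant the authors would report is not specified, and the labels, predictions, and the name `mcnemar_exact` are hypothetical.

```python
from math import comb


def mcnemar_exact(y_true, pred_a, pred_b):
    """Exact two-sided McNemar test for two classifiers on paired examples.

    Returns (b, c, p_value) where b counts examples only classifier A gets
    right and c counts examples only classifier B gets right.
    """
    b = sum(1 for t, a, o in zip(y_true, pred_a, pred_b) if a == t and o != t)
    c = sum(1 for t, a, o in zip(y_true, pred_a, pred_b) if a != t and o == t)
    n = b + c
    if n == 0:  # the classifiers never disagree on correctness
        return b, c, 1.0
    # Under the null, discordant pairs split as Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Hypothetical labels and predictions, for illustration only.
labels = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0]
graph_preds = list(labels)                           # correct on every patient
ehr_preds = [1 - y for y in labels[:6]] + labels[6:]  # wrong on the first six

b, c, p_value = mcnemar_exact(labels, graph_preds, ehr_preds)
```

Because the test conditions on discordant pairs only, it is insensitive to how many patients both models classify correctly, which makes it a reasonable companion to the paired t-test over cross-validation folds.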
Circularity Check
No significant circularity in empirical embedding evaluation
The paper reports standard empirical comparisons of graph embedding methods (random walk, Poincaré) trained on the SNOMED-CT knowledge graph against EHR-derived baselines, with performance measured on node classification, link prediction, and patient diagnosis tasks. No derivation chain, first-principles result, or mathematical prediction is presented that reduces to its own inputs by construction. Link prediction and similarity metrics are conventional held-out evaluations for graph embeddings; the patient diagnosis task provides an external downstream benchmark independent of the training graph structure. Any self-citations are not load-bearing for the central empirical claims.
Lean theorems connected to this paper
- IndisputableMonolith/Cost (CostAlphaLog, Jcost): costAlphaLog_fourth_deriv_at_zero; dAlembert_to_ODE_general (echoes)
  echoes: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.
  The distance between two points u and v within the Poincaré ball is given as $d(u,v) = \cosh^{-1}\left(1 + 2\,\frac{\|u - v\|^2}{(1 - \|u\|^2)(1 - \|v\|^2)}\right)$ ... loss function ... $\log \frac{e^{-d(u,v)}}{\sum e^{-d(u,v')}}$
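The distance formula quoted in this passage can be evaluated directly. The following is a minimal sketch of that formula only (not the training loss, which is elided in the passage); the function name `poincare_distance` and the sample points are assumptions made for illustration.

```python
import math


def poincare_distance(u, v):
    """Distance between two points strictly inside the unit Poincaré ball.

    Implements d(u, v) = arcosh(1 + 2 * ||u - v||^2 /
                                ((1 - ||u||^2) * (1 - ||v||^2))).
    """
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v))
    return math.acosh(arg)


# Hypothetical points: the same Euclidean direction at different radii.
origin = [0.0, 0.0]
near_center = [0.5, 0.0]
near_boundary = [0.9, 0.0]

d_center = poincare_distance(origin, near_center)
d_boundary = poincare_distance(origin, near_boundary)
```

From the origin, the distance to a point at Euclidean radius r reduces to ln((1 + r) / (1 - r)), so distances grow without bound as points approach the boundary. This is the property that leaves room for the many leaf concepts of a deep hierarchy near the edge of the ball.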
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] A L Beam, B Kompa, I Fried, N P Palmer, X Shi, T Cai, and I S Kohane. Clinical concept embeddings learned from massive sources of medical data. arXiv preprint arXiv:1804.01486, 2018.
- [2] Edward Choi, Mohammad Taha Bahadori, Elizabeth Searles, Catherine Coffey, Michael Thompson, James Bost, Javier Tejedor-Sojo, and Jimeng Sun. Multi-layer representation learning for medical concepts. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1495–1504. ACM, 2016.
- [3] Edward Choi, Mohammad Taha Bahadori, Le Song, Walter F Stewart, and Jimeng Sun. Gram: graph-based attention model for healthcare representation learning. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 787–795. ACM, 2017.
- [4] Lance De Vine, Guido Zuccon, Bevan Koopman, Laurianne Sitbon, and Peter Bruza. Medical semantic similarity with a neural language model. In Proceedings of the 23rd ACM International Conference on Information and Knowledge Management, pp. 1819–1822. ACM, 2014.
- [5] B Dhingra, C J Shallue, M Norouzi, A M Dai, and G E Dahl. Embedding text in hyperbolic spaces. arXiv preprint arXiv:1806.04313, 2018.
- [6] Yuxiao Dong, Nitesh V Chawla, and Ananthram Swami. metapath2vec: Scalable representation learning for heterogeneous networks. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 135–144. ACM, 2017.
- [7] K Donnelly. SNOMED-CT: The advanced terminology and coding system for eHealth. Studies in Health Technology and Informatics, 121:279, 2006.
- [8] Aditya Grover and Jure Leskovec. node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 855–864. ACM, 2016.
- [9] Johnson AEW, Pollard TJ, Shen L, Lehman L, Feng M, Ghassemi M, Moody B, Szolovits P, Celi LA, and Mark RG. MIMIC-III, a freely accessible critical care database. Scientific Data, 2016. DOI: 10.1038/sdata.2016.35. Available at: http://www.nature.com/articles/sdata201635
- [10] Alexa T McCray. The UMLS semantic network. In Proceedings. Symposium on Computer Applications in Medical Care, pp. 503–507. American Medical Informatics Association, 1989.
- [11] J A Miñarro-Giménez, O Marín-Alonso, and M Samwald. Applying deep learning techniques on medical corpora from the World Wide Web: a prototypical system and evaluation. CoRR, abs/1502.03682, 2015. URL http://arxiv.org/abs/1502.03682
- [12] Maximilian Nickel and Douwe Kiela. Poincaré embeddings for learning hierarchical representations. In Advances in Neural Information Processing Systems, pp. 6338–6347, 2017.
- [13] Bryan Perozzi, Rami Al-Rfou, and Steven Skiena. Deepwalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 701–710. ACM, 2014.
- [14] R Řehůřek and P Sojka. Software Framework for Topic Modelling with Large Corpora. In LREC Workshop on New Challenges for NLP Frameworks, 2010.
- [15] Muhan Zhang and Yixin Chen. Link prediction based on graph neural networks. In Advances in Neural Information Processing Systems, pp. 5165–5175, 2018.