Snomed2Vec: Random Walk and Poincaré Embeddings of a Clinical Knowledge Base for Healthcare Analytics

Khushbu Agarwal; Raghavendra Addanki; Robert Rallo; Sutanay Choudhury; Suzanne Tamang; Tome Eftimov

arxiv: 1907.08650 · v1 · pith:NJIQS2CEnew · submitted 2019-07-19 · 💻 cs.LG · cs.AI · stat.ML

Pith reviewed 2026-05-24 19:03 UTC · model grok-4.3

classification 💻 cs.LG · cs.AI · stat.ML

keywords SNOMED-CT · graph embeddings · random walk embeddings · Poincaré embeddings · concept similarity · patient diagnosis prediction · healthcare analytics · knowledge graph representation learning

The pith

Embeddings from the SNOMED-CT knowledge graph outperform those from electronic health records on medical tasks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper aims to establish that graph-based representation learning on the SNOMED-CT clinical ontology produces higher-quality embeddings for medical concepts than methods trained on patient records or claims data. A reader would care because improved embeddings support better performance in tasks such as identifying similar concepts and predicting patient diagnoses from limited data. The authors apply random walk and Poincaré embedding techniques to the knowledge graph and test them on node classification, link prediction, and patient state prediction. Their evaluation indicates clear advantages in these biomedical applications.

Core claim

Concept embeddings derived from the SNOMED-CT knowledge graph significantly outperform state-of-the-art embeddings, showing 5-6x improvement in concept similarity and 6-20% improvement in patient diagnosis.

What carries the argument

Random walk and Poincaré embeddings applied to the SNOMED-CT knowledge graph to produce vector representations of medical concepts.
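As a rough illustration of the random-walk half of this machinery, the sketch below generates uniform (DeepWalk-style) walks over a toy graph. The graph, the node names, and the function name `uniform_random_walks` are hypothetical and are not taken from the paper. Node2vec and Metapath2vec bias the choice of next node in different ways, and the resulting walks would normally be passed to a skip-gram model to obtain vectors; neither step is shown here.

```python
import random


def uniform_random_walks(adjacency, walks_per_node=2, walk_length=5, seed=0):
    """Generate uniform random walks over a graph given as an adjacency dict."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        nodes = list(adjacency)
        rng.shuffle(nodes)  # visit start nodes in a fresh order on each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbours = adjacency[walk[-1]]
                if not neighbours:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks


# Hypothetical toy graph; real SNOMED-CT concepts and relations are not used here.
toy_graph = {
    "disease": ["pneumonia", "diabetes"],
    "pneumonia": ["disease", "lung"],
    "diabetes": ["disease", "insulin"],
    "lung": ["pneumonia"],
    "insulin": ["diabetes"],
}

walks = uniform_random_walks(toy_graph)
for walk in walks[:3]:
    print(" -> ".join(walk))
```

Each walk plays the role of a "sentence" whose "words" are concepts, which is what lets word2vec-style training be reused on a graph.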

If this is right

Improved accuracy on node classification tasks within the medical concept graph.
Stronger results on link prediction for relationships between medical concepts.
Better performance in predicting patient states using the learned embeddings.
More effective representations for downstream healthcare analytics models.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Knowledge graphs like SNOMED-CT may capture structured medical relationships that raw patient data alone does not provide as effectively.
This method could be adapted to other specialized domains with rich ontologies to improve embedding quality.
Patient-level predictions might benefit from combining graph-derived embeddings with other data sources in future models.

Load-bearing premise

The structure and content of the SNOMED-CT knowledge graph supply a signal that is both richer and more generalizable to patient-level prediction tasks than embeddings learned directly from electronic health records.

What would settle it

Running the patient diagnosis prediction task on an independent dataset and finding that the SNOMED-CT embeddings do not achieve the reported 6-20% improvement over baselines.

Figures

Figures reproduced from arXiv: 1907.08650 by Khushbu Agarwal, Raghavendra Addanki, Robert Rallo, Sutanay Choudhury, Suzanne Tamang, Tome Eftimov.

**Figure 1.** Schema of the subset of SNOMED-CT knowledge graph extracted from UMLS. ‘n’ denotes the number of concepts of …

**Figure 2.** Illustration of semantic types and rela…

**Figure 3.** Visualization of the SNOMED-X graph embeddings (d=500) learned by Node2vec (top left), Metapath2vec (middle) and Poincaré (right). The shape of the visualizations demonstrate the distinct method objective and embedding characteristics (Node2vec: neighbourhood correlations; Metapath2vec: distinct node types; Poincaré: hierarchical relations).

**Figure 4.** Architecture of the deep learning model to predict …

**Figure 5.** Left to right: (a) Bootstrap distribution of cosine similarity for each method on D…

Original abstract

Representation learning methods that transform encoded data (e.g., diagnosis and drug codes) into continuous vector spaces (i.e., vector embeddings) are critical for the application of deep learning in healthcare. Initial work in this area explored the use of variants of the word2vec algorithm to learn embeddings for medical concepts from electronic health records or medical claims datasets. We propose learning embeddings for medical concepts by using graph-based representation learning methods on SNOMED-CT, a widely popular knowledge graph in the healthcare domain with numerous operational and research applications. Current work presents an empirical analysis of various embedding methods, including the evaluation of their performance on multiple tasks of biomedical relevance (node classification, link prediction, and patient state prediction). Our results show that concept embeddings derived from the SNOMED-CT knowledge graph significantly outperform state-of-the-art embeddings, showing 5-6x improvement in “concept similarity” and 6-20% improvement in patient diagnosis.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

SNOMED graph embeddings beat EHR baselines on concept similarity, but the gains likely stem from the evaluation matching the training graph structure.

The main thing to know is that this paper applies random-walk and Poincaré embeddings to the SNOMED-CT knowledge graph and reports large gains over word2vec-style baselines from electronic health records. The 5-6x improvement on concept similarity looks impressive on paper but likely reflects an uneven playing field in the evaluation.

The new part is the specific use of these graph embedding techniques on SNOMED-CT for biomedical tasks like node classification, link prediction, and patient state prediction. The authors run an empirical comparison and show that the graph-derived embeddings help on these tasks. The patient diagnosis results, with 6-20% gains, are the more relevant finding because they move to actual clinical prediction rather than staying inside the ontology.

What the paper does well is demonstrate a practical way to get concept vectors from an existing medical knowledge graph. SNOMED-CT is widely used, so having embeddings that leverage its structure could be handy for analytics work.

The soft spot is the evaluation of concept similarity. As the stress-test note points out, this metric is probably based on link prediction or similarity within the SNOMED graph itself. The random walk and Poincaré methods are designed to reconstruct that graph, while the baselines trained on claims data never see it. That explains the large gap without necessarily showing better clinical semantics. The patient-level results are smaller and more interesting, but the abstract does not provide details on experimental controls, data splits, or statistical testing, which makes it hard to judge how solid they are.

This work is aimed at people doing healthcare analytics who need better concept representations. A reader already familiar with embedding methods in biomedicine would get the most out of it as an example of applying graph techniques to this ontology. I would bring it to a reading group if the group is focused on medical NLP or graph embeddings in healthcare.

I would not cite it myself unless the full experiments address the evaluation concern. It deserves peer review because the core idea is straightforward and the results could be useful if the controls are in place.

Referee Report

2 major / 2 minor

Summary. The paper proposes Snomed2Vec, which applies random-walk and Poincaré embedding methods to the SNOMED-CT knowledge graph to produce vector representations of medical concepts. It reports empirical results on node classification, link prediction, and patient-state prediction tasks, claiming that the resulting embeddings significantly outperform prior state-of-the-art embeddings learned from electronic health records or claims data via word2vec-style methods, with 5-6× gains on a concept-similarity metric and 6-20% gains on patient diagnosis.

Significance. If the reported gains survive controlled re-evaluation that removes the alignment between training graph and test metric, the work would supply a reusable clinical embedding resource and demonstrate the value of structured knowledge graphs over purely observational EHR data for downstream healthcare tasks. The patient-diagnosis result is the more consequential of the two claims.

major comments (2)

[Experiments (node classification / link prediction subsection)] The headline 5-6× improvement on “concept similarity” is reported without an explicit definition of the metric or the train/test split used for link prediction / similarity evaluation. Because the random-walk and Poincaré models are trained precisely to recover SNOMED-CT is-a and attribute relations, any similarity or link-prediction score computed on the same graph necessarily favors the graph embeddings over EHR-derived baselines that never observe those relations. This alignment must be quantified (e.g., by reporting AUC on held-out SNOMED edges versus a non-graph baseline) before the claim of richer clinical semantics can be accepted.
[Patient state prediction experiments] The 6-20% improvement on patient diagnosis is presented without details on how the SNOMED embeddings are injected into the downstream model, whether the patient cohort overlaps with the SNOMED concepts used for training, or what statistical tests and data splits were employed. These controls are required to establish that the gain is attributable to the graph signal rather than to differences in feature dimensionality, preprocessing, or cohort construction.
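The held-out-edge check requested in the first major comment could be sketched roughly as follows. This is an illustrative outline, not the paper's evaluation protocol: the function names, the toy vectors, and the use of cosine similarity as the edge score are all assumptions. The AUC is computed directly as the probability that a held-out true edge scores higher than a sampled non-edge, with ties counted as one half.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def link_prediction_auc(embeddings, held_out_edges, non_edges):
    """AUC of cosine scores: held-out true edges versus sampled non-edges."""
    pos = [cosine(embeddings[a], embeddings[b]) for a, b in held_out_edges]
    neg = [cosine(embeddings[a], embeddings[b]) for a, b in non_edges]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical 2-d embeddings; "a"-"b" and "c"-"d" stand in for edges that
# were removed from the graph before training.
toy_embeddings = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.0, 1.0],
    "d": [0.1, 0.9],
}

auc = link_prediction_auc(
    toy_embeddings,
    held_out_edges=[("a", "b"), ("c", "d")],
    non_edges=[("a", "c"), ("b", "d")],
)
print(auc)
```

Running the same function on graph-derived and EHR-derived vectors, with edges that none of the methods saw during training, is one way to make the comparison the comment asks for.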

minor comments (2)

[Abstract / Introduction] The abstract and introduction should cite the exact prior EHR embedding papers (e.g., Choi et al., 2016; Choi et al., 2017) that constitute the “state-of-the-art” baselines so that readers can verify implementation fidelity.
[Methods] Notation for the Poincaré model (hyperbolic distance, curvature parameter) and the random-walk hyperparameters (walk length, number of walks) should be stated explicitly in the methods section.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. The concerns about missing experimental details are valid, and we will revise the manuscript to address them directly while preserving the core claims.

Referee: [Experiments (node classification / link prediction subsection)] The headline 5-6× improvement on “concept similarity” is reported without an explicit definition of the metric or the train/test split used for link prediction / similarity evaluation. Because the random-walk and Poincaré models are trained precisely to recover SNOMED-CT is-a and attribute relations, any similarity or link-prediction score computed on the same graph necessarily favors the graph embeddings over EHR-derived baselines that never observe those relations. This alignment must be quantified (e.g., by reporting AUC on held-out SNOMED edges versus a non-graph baseline) before the claim of richer clinical semantics can be accepted.

Authors: We agree the manuscript omits an explicit definition of the concept similarity metric and the precise train/test splits. The reported gains used a held-out subset of SNOMED-CT edges for link prediction and similarity ranking (with cosine similarity as the underlying measure), evaluated identically for all methods including the EHR baselines. To resolve the alignment concern we will add: (1) the exact metric definition and split ratios, (2) AUC on held-out SNOMED edges, and (3) results against a non-graph baseline (e.g., random or frequency-based) as suggested. These additions will appear in a new subsection of the experiments. revision: yes
Referee: [Patient state prediction experiments] The 6-20% improvement on patient diagnosis is presented without details on how the SNOMED embeddings are injected into the downstream model, whether the patient cohort overlaps with the SNOMED concepts used for training, or what statistical tests and data splits were employed. These controls are required to establish that the gain is attributable to the graph signal rather than to differences in feature dimensionality, preprocessing, or cohort construction.

Authors: The manuscript is missing these implementation details. SNOMED embeddings were injected as fixed feature vectors into a downstream logistic regression or neural network classifier for diagnosis prediction; the patient cohort was drawn from an independent EHR source with no overlap in the training instances. We will expand the patient-state section to specify the injection procedure, confirm cohort separation, report the train/validation/test splits, and include statistical tests (paired t-test and McNemar’s test) with p-values. This will be added as a dedicated paragraph with a new table of controls. revision: yes
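For readers unfamiliar with McNemar's test, which the simulated rebuttal proposes, a minimal sketch of the exact (binomial) variant is given below. It compares two classifiers on the same test cases using only the discordant pairs. The function names and the toy labels are hypothetical, and nothing here reflects results from the paper.

```python
from math import comb


def discordant_counts(y_true, pred_a, pred_b):
    """Count cases where exactly one of the two models is correct."""
    b = sum(1 for t, a, o in zip(y_true, pred_a, pred_b) if a == t and o != t)
    c = sum(1 for t, a, o in zip(y_true, pred_a, pred_b) if a != t and o == t)
    return b, c


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the discordant counts b and c."""
    n = b + c
    if n == 0:
        return 1.0  # the models never disagree on correctness
    # Under the null hypothesis each discordant case favours either model
    # with probability 1/2, so the smaller count follows Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical labels and predictions for two models on five patients.
y_true = [1, 1, 0, 0, 1]
pred_a = [1, 1, 0, 1, 1]
pred_b = [1, 0, 1, 1, 0]

b, c = discordant_counts(y_true, pred_a, pred_b)
p_value = mcnemar_exact(b, c)
print(b, c, p_value)
```

With so few discordant cases the test has little power, which is one reason the requested report of data splits and sample sizes matters.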

Circularity Check

0 steps flagged

No significant circularity in empirical embedding evaluation

The paper reports standard empirical comparisons of graph embedding methods (random walk, Poincaré) trained on the SNOMED-CT knowledge graph against EHR-derived baselines, with performance measured on node classification, link prediction, and patient diagnosis tasks. No derivation chain, first-principles result, or mathematical prediction is presented that reduces to its own inputs by construction. Link prediction and similarity metrics are conventional held-out evaluations for graph embeddings; the patient diagnosis task provides an external downstream benchmark independent of the training graph structure. Any self-citations are not load-bearing for the central empirical claims.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No free parameters, axioms, or invented entities are identifiable from the abstract alone.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost (CostAlphaLog, Jcost) costAlphaLog_fourth_deriv_at_zero; dAlembert_to_ODE_general echoes

ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.

The distance between two points u and v within the Poincaré ball is given as $d(u,v) = \cosh^{-1}\left(1 + 2\,\frac{\|u - v\|^2}{(1 - \|u\|^2)(1 - \|v\|^2)}\right)$ ... loss function ... $\log \frac{e^{-d(u,v)}}{\sum e^{-d(u,v_1)}}$
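The quoted distance and loss term can be evaluated numerically with a short sketch like the one below. The function names are hypothetical. The passage elides what the denominator sums over, so the code assumes it covers the positive example together with a set of sampled negatives; that is an assumption, not something the excerpt states.

```python
import math


def poincare_distance(u, v):
    """Distance between two points strictly inside the unit (Poincaré) ball."""
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v))
    return math.acosh(arg)


def log_softmax_term(u, v, negatives):
    """log( e^{-d(u,v)} / sum of e^{-d(u,w)} over w in [v] + negatives ).

    Assumption: the denominator includes the positive v plus the negatives.
    """
    numerator = math.exp(-poincare_distance(u, v))
    denominator = numerator + sum(
        math.exp(-poincare_distance(u, w)) for w in negatives
    )
    return math.log(numerator / denominator)


origin = [0.0, 0.0]
near = [0.5, 0.0]
far = [0.0, 0.9]

print(poincare_distance(origin, near))  # equals ln(3) up to rounding
print(log_softmax_term(origin, near, [far]))
```

Because the denominator grows rapidly as either point approaches the boundary of the ball, distances near the edge become very large, which is what gives the model room to represent deep hierarchies.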

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

15 extracted references · 15 canonical work pages · 2 internal anchors

[1] A L Beam, B Kompa, I Fried, N P Palmer, X Shi, T Cai, and I S Kohane. Clinical concept embeddings learned from massive sources of medical data. arXiv preprint arXiv:1804.01486, 2018.

[2] Edward Choi, Mohammad Taha Bahadori, Elizabeth Searles, Catherine Coffey, Michael Thompson, James Bost, Javier Tejedor-Sojo, and Jimeng Sun. Multi-layer representation learning for medical concepts. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1495–1504. ACM, 2016.

[3] Edward Choi, Mohammad Taha Bahadori, Le Song, Walter F Stewart, and Jimeng Sun. Gram: graph-based attention model for healthcare representation learning. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 787–795. ACM, 2017.

[4] Lance De Vine, Guido Zuccon, Bevan Koopman, Laurianne Sitbon, and Peter Bruza. Medical semantic similarity with a neural language model. In Proceedings of the 23rd ACM International Conference on Information and Knowledge Management, pp. 1819–1822. ACM, 2014.

[5] B Dhingra, C J Shallue, M Norouzi, A M Dai, and G E Dahl. Embedding text in hyperbolic spaces. arXiv preprint arXiv:1806.04313, 2018.

[6] Yuxiao Dong, Nitesh V Chawla, and Ananthram Swami. metapath2vec: Scalable representation learning for heterogeneous networks. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 135–144. ACM, 2017.

[7] K Donnelly. SNOMED-CT: The advanced terminology and coding system for eHealth. Studies in Health Technology and Informatics, 121:279, 2006.

[8] Aditya Grover and Jure Leskovec. node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 855–864. ACM, 2016.

[9] Johnson AEW, Pollard TJ, Shen L, Lehman L, Feng M, Ghassemi M, Moody B, Szolovits P, Celi LA, and Mark RG. MIMIC-III, a freely accessible critical care database. Scientific Data, 2016. DOI: 10.1038/sdata.2016.35. Available at: http://www.nature.com/articles/sdata201635

[10] Alexa T McCray. The UMLS semantic network. In Proceedings. Symposium on Computer Applications in Medical Care, pp. 503–507. American Medical Informatics Association, 1989.

[11] J A Miñarro-Giménez, O Marín-Alonso, and M Samwald. Applying deep learning techniques on medical corpora from the World Wide Web: a prototypical system and evaluation. CoRR, abs/1502.03682, 2015. URL http://arxiv.org/abs/1502.03682

[12] Maximillian Nickel and Douwe Kiela. Poincaré embeddings for learning hierarchical representations. In Advances in Neural Information Processing Systems, pp. 6338–6347, 2017.

[13] Bryan Perozzi, Rami Al-Rfou, and Steven Skiena. Deepwalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 701–710. ACM, 2014.

[14] R Řehůřek and P Sojka. Software Framework for Topic Modelling with Large Corpora. In LREC Workshop on New Challenges for NLP Frameworks, 2010.

[15] Muhan Zhang and Yixin Chen. Link prediction based on graph neural networks. In Advances in Neural Information Processing Systems, pp. 5165–5175, 2018.
