pith. sign in

arxiv: 1907.08650 · v1 · pith:NJIQS2CEnew · submitted 2019-07-19 · 💻 cs.LG · cs.AI· stat.ML

Snomed2Vec: Random Walk and Poincar\'e Embeddings of a Clinical Knowledge Base for Healthcare Analytics

Pith reviewed 2026-05-24 19:03 UTC · model grok-4.3

classification 💻 cs.LG cs.AIstat.ML
keywords SNOMED-CTgraph embeddingsrandom walk embeddingsPoincaré embeddingsconcept similaritypatient diagnosis predictionhealthcare analyticsknowledge graph representation learning
0
0 comments X

The pith

Embeddings from the SNOMED-CT knowledge graph outperform those from electronic health records on medical tasks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper aims to establish that graph-based representation learning on the SNOMED-CT clinical ontology produces higher-quality embeddings for medical concepts than methods trained on patient records or claims data. A reader would care because improved embeddings support better performance in tasks such as identifying similar concepts and predicting patient diagnoses from limited data. The authors apply random walk and Poincaré embedding techniques to the knowledge graph and test them on node classification, link prediction, and patient state prediction. Their evaluation indicates clear advantages in these biomedical applications.

Core claim

Concept embeddings derived from the SNOMED-CT knowledge graph significantly outperform state-of-the-art embeddings, showing 5-6x improvement in concept similarity and 6-20% improvement in patient diagnosis.

What carries the argument

Random walk and Poincaré embeddings applied to the SNOMED-CT knowledge graph to produce vector representations of medical concepts.

If this is right

  • Improved accuracy on node classification tasks within the medical concept graph.
  • Stronger results on link prediction for relationships between medical concepts.
  • Better performance in predicting patient states using the learned embeddings.
  • More effective representations for downstream healthcare analytics models.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Knowledge graphs like SNOMED-CT may capture structured medical relationships that raw patient data alone does not provide as effectively.
  • This method could be adapted to other specialized domains with rich ontologies to improve embedding quality.
  • Patient-level predictions might benefit from combining graph-derived embeddings with other data sources in future models.

Load-bearing premise

The structure and content of the SNOMED-CT knowledge graph supply a signal that is both richer and more generalizable to patient-level prediction tasks than embeddings learned directly from electronic health records.

What would settle it

Running the patient diagnosis prediction task on an independent dataset and finding that the SNOMED-CT embeddings do not achieve the reported 6-20% improvement over baselines.

Figures

Figures reproduced from arXiv: 1907.08650 by Khushbu Agarwal, Raghavendra Addanki, Robert Rallo, Sutanay Choudhury, Suzanne Tamang, Tome Eftimov.

Figure 1
Figure 1. Figure 1: Schema of the subset of SNOMED-CT knowledge graph extracted from UMLS. ‘n’ denotes the number of concepts of [PITH_FULL_IMAGE:figures/full_fig_p002_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Illustration of semantic types and rela [PITH_FULL_IMAGE:figures/full_fig_p002_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Visualization of the SNOMED-X graph embeddings (d=500) learned by Node2vec (top left), Metapath2vec (middle) and Poincaré (right). The shape of the visualizations demonstrate the distinct method objective and embedding characteristics (Node2vec: neighbourhood correlations; Metapath2vec: distinct node types; Poincare: hierarchical relations) (2) Link prediction: We test the accuracy of learned embed￾dings i… view at source ↗
Figure 4
Figure 4. Figure 4: Architecture of the deep learning model to predict [PITH_FULL_IMAGE:figures/full_fig_p003_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Left to right: (a) Bootstrap distribution of cosine similarity for each method on D [PITH_FULL_IMAGE:figures/full_fig_p005_5.png] view at source ↗
read the original abstract

Representation learning methods that transform encoded data (e.g., diagnosis and drug codes) into continuous vector spaces (i.e., vector embeddings) are critical for the application of deep learning in healthcare. Initial work in this area explored the use of variants of the word2vec algorithm to learn embeddings for medical concepts from electronic health records or medical claims datasets. We propose learning embeddings for medical concepts by using graph-based representation learning methods on SNOMED-CT, a widely popular knowledge graph in the healthcare domain with numerous operational and research applications. Current work presents an empirical analysis of various embedding methods, including the evaluation of their performance on multiple tasks of biomedical relevance (node classification, link prediction, and patient state prediction). Our results show that concept embeddings derived from the SNOMED-CT knowledge graph significantly outperform state-of-the-art embeddings, showing 5-6x improvement in ``concept similarity" and 6-20\% improvement in patient diagnosis.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes Snomed2Vec, which applies random-walk and Poincaré embedding methods to the SNOMED-CT knowledge graph to produce vector representations of medical concepts. It reports empirical results on node classification, link prediction, and patient-state prediction tasks, claiming that the resulting embeddings significantly outperform prior state-of-the-art embeddings learned from electronic health records or claims data via word2vec-style methods, with 5-6× gains on a concept-similarity metric and 6-20% gains on patient diagnosis.

Significance. If the reported gains survive controlled re-evaluation that removes the alignment between training graph and test metric, the work would supply a reusable clinical embedding resource and demonstrate the value of structured knowledge graphs over purely observational EHR data for downstream healthcare tasks. The patient-diagnosis result is the more consequential of the two claims.

major comments (2)
  1. [Experiments (node classification / link prediction subsection)] The headline 5-6× improvement on “concept similarity” is reported without an explicit definition of the metric or the train/test split used for link prediction / similarity evaluation. Because the random-walk and Poincaré models are trained precisely to recover SNOMED-CT is-a and attribute relations, any similarity or link-prediction score computed on the same graph necessarily favors the graph embeddings over EHR-derived baselines that never observe those relations. This alignment must be quantified (e.g., by reporting AUC on held-out SNOMED edges versus a non-graph baseline) before the claim of richer clinical semantics can be accepted.
  2. [Patient state prediction experiments] The 6-20% improvement on patient diagnosis is presented without details on how the SNOMED embeddings are injected into the downstream model, whether the patient cohort overlaps with the SNOMED concepts used for training, or what statistical tests and data splits were employed. These controls are required to establish that the gain is attributable to the graph signal rather than to differences in feature dimensionality, preprocessing, or cohort construction.
minor comments (2)
  1. [Abstract / Introduction] The abstract and introduction should cite the exact prior EHR embedding papers (e.g., Choi et al., 2016; Choi et al., 2017) that constitute the “state-of-the-art” baselines so that readers can verify implementation fidelity.
  2. [Methods] Notation for the Poincaré model (hyperbolic distance, curvature parameter) and the random-walk hyperparameters (walk length, number of walks) should be stated explicitly in the methods section.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. The concerns about missing experimental details are valid, and we will revise the manuscript to address them directly while preserving the core claims.

read point-by-point responses
  1. Referee: [Experiments (node classification / link prediction subsection)] The headline 5-6× improvement on “concept similarity” is reported without an explicit definition of the metric or the train/test split used for link prediction / similarity evaluation. Because the random-walk and Poincaré models are trained precisely to recover SNOMED-CT is-a and attribute relations, any similarity or link-prediction score computed on the same graph necessarily favors the graph embeddings over EHR-derived baselines that never observe those relations. This alignment must be quantified (e.g., by reporting AUC on held-out SNOMED edges versus a non-graph baseline) before the claim of richer clinical semantics can be accepted.

    Authors: We agree the manuscript omits an explicit definition of the concept similarity metric and the precise train/test splits. The reported gains used a held-out subset of SNOMED-CT edges for link prediction and similarity ranking (with cosine similarity as the underlying measure), evaluated identically for all methods including the EHR baselines. To resolve the alignment concern we will add: (1) the exact metric definition and split ratios, (2) AUC on held-out SNOMED edges, and (3) results against a non-graph baseline (e.g., random or frequency-based) as suggested. These additions will appear in a new subsection of the experiments. revision: yes

  2. Referee: [Patient state prediction experiments] The 6-20% improvement on patient diagnosis is presented without details on how the SNOMED embeddings are injected into the downstream model, whether the patient cohort overlaps with the SNOMED concepts used for training, or what statistical tests and data splits were employed. These controls are required to establish that the gain is attributable to the graph signal rather than to differences in feature dimensionality, preprocessing, or cohort construction.

    Authors: The manuscript is missing these implementation details. SNOMED embeddings were injected as fixed feature vectors into a downstream logistic regression or neural network classifier for diagnosis prediction; the patient cohort was drawn from an independent EHR source with no overlap in the training instances. We will expand the patient-state section to specify the injection procedure, confirm cohort separation, report the train/validation/test splits, and include statistical tests (paired t-test and McNemar’s test) with p-values. This will be added as a dedicated paragraph with a new table of controls. revision: yes

Circularity Check

0 steps flagged

No significant circularity in empirical embedding evaluation

full rationale

The paper reports standard empirical comparisons of graph embedding methods (random walk, Poincaré) trained on the SNOMED-CT knowledge graph against EHR-derived baselines, with performance measured on node classification, link prediction, and patient diagnosis tasks. No derivation chain, first-principles result, or mathematical prediction is presented that reduces to its own inputs by construction. Link prediction and similarity metrics are conventional held-out evaluations for graph embeddings; the patient diagnosis task provides an external downstream benchmark independent of the training graph structure. Any self-citations are not load-bearing for the central empirical claims.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No free parameters, axioms, or invented entities are identifiable from the abstract alone.

pith-pipeline@v0.9.0 · 5721 in / 1077 out tokens · 32189 ms · 2026-05-24T19:03:28.671232+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

15 extracted references · 15 canonical work pages · 2 internal anchors

  1. [1]

    Clinical concept embeddings learned from massive sources of medical data.arXiv preprint arXiv:1804.01486, 2018

    A L Beam, B Kompa, I Fried, N P Palmer, X Shi, T Cai, and I S Kohane. Clinical concept embeddings learned from massive sources of medical data.arXiv preprint arXiv:1804.01486, 2018

  2. [2]

    Multi-layer representation learning for medical concepts

    Edward Choi, Mohammad Taha Bahadori, Elizabeth Searles, Catherine Coffey, Michael Thompson, James Bost, Javier Tejedor-Sojo, and Jimeng Sun. Multi-layer representation learning for medical concepts. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining , pp. 1495–1504. ACM, 2016

  3. [3]

    Gram: graph-based attention model for healthcare representation learning

    Edward Choi, Mohammad Taha Bahadori, Le Song, Walter F Stewart, and Jimeng Sun. Gram: graph-based attention model for healthcare representation learning. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining , pp. 787–795. ACM, 2017

  4. [4]

    Medical semantic similarity with a neural language model

    Lance De Vine, Guido Zuccon, Bevan Koopman, Laurianne Sitbon, and Peter Bruza. Medical semantic similarity with a neural language model. In Proceed- ings of the 23rd ACM international conference on conference on information and knowledge management, pp. 1819–1822. ACM, 2014

  5. [5]

    Embedding Text in Hyperbolic Spaces

    B Dhingra, C J Shallue, M Norouzi, A M Dai, and G E Dahl. Embedding text in hyperbolic spaces. arXiv preprint arXiv:1806.04313, 2018

  6. [6]

    metapath2vec: Scalable representation learning for heterogeneous networks

    Yuxiao Dong, Nitesh V Chawla, and Ananthram Swami. metapath2vec: Scalable representation learning for heterogeneous networks. In Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining , pp. 135–144. ACM, 2017

  7. [7]

    SNOMED-CT: The advanced terminology and coding system for eHealth

    K Donnelly. SNOMED-CT: The advanced terminology and coding system for eHealth. Studies in health technology and informatics , 121:279, 2006

  8. [8]

    node2vec: Scalable feature learning for net- works

    Aditya Grover and Jure Leskovec. node2vec: Scalable feature learning for net- works. In Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining , pp. 855–864. ACM, 2016

  9. [9]

    MIMIC-III, a freely accessible critical care database

    Shen L Lehman L Feng M Ghassemi M Moody B Szolovits P Celi LA John- son AEW, Pollard TJ and Mark RG. MIMIC-III, a freely accessible critical care database. Scientific Data (2016). DOI: 10.1038/sdata.2016.35. A vailable at: http://www.nature.com/articles/sdata201635, 2016

  10. [10]

    The UMLS semantic network

    Alexa T McCray. The UMLS semantic network. In Proceedings. Symposium on Computer Applications in Medical Care , pp. 503–507. American Medical Informat- ics Association, 1989

  11. [11]

    Applying deep learning techniques on medical corpora from the World Wide Web: a prototypical system and evaluation

    J A Miñarro-Giménez, O Marín-Alonso, and M Samwald. Applying deep learning techniques on medical corpora from the World Wide Web: a prototypical system and evaluation. CoRR, abs/1502.03682, 2015. URL http://arxiv.org/abs/1502.03682

  12. [12]

    Poincaré embeddings for learning hierar- chical representations

    Maximillian Nickel and Douwe Kiela. Poincaré embeddings for learning hierar- chical representations. In Advances in neural information processing systems , pp. 6338–6347, 2017

  13. [13]

    Deepwalk: Online learning of social representations

    Bryan Perozzi, Rami Al-Rfou, and Steven Skiena. Deepwalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining , pp. 701–710. ACM, 2014

  14. [14]

    Software Framework for Topic Modelling with Large Corpora

    R Řehůřek and P Sojka. Software Framework for Topic Modelling with Large Corpora. In LREC Workshop on New Challenges for NLP Frameworks , 2010

  15. [15]

    Link prediction based on graph neural networks

    Muhan Zhang and Yixin Chen. Link prediction based on graph neural networks. In Advances in Neural Information Processing Systems , pp. 5165–5175, 2018. 5