Curation of a Cardiology Interface Terminology for Highlighting Electronic Health Records using Machine Learning

Andrew J. Einstein; Fadi P. Deek; Gai Elhanan; James Geller; Luke Lindemann; Mahshad Koohi Habibi Dehkordi; Shuxin Zhou; Vipina K. Keloth; Yehoshua Perl

arxiv: 2606.08311 · v1 · pith:SOABAXBHnew · submitted 2026-06-06 · 💻 cs.AI


Pith reviewed 2026-06-27 19:24 UTC · model grok-4.3

keywords: cardiology interface terminology · electronic health records · machine learning · SNOMED · EHR highlighting · medical terminology · interface terminology curation


The pith

A three-phase machine learning process creates a cardiology interface terminology that highlights 74.21 percent of details in EHR notes.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper proposes a three-phase machine learning technique to design a Cardiology Interface Terminology for highlighting key information in electronic health record notes of cardiology patients. The process begins with an initial CIT from SNOMED sub-hierarchies and EHR-mined concepts, uses iterative extraction and semi-automatic review to form training data called TCIT, then trains an ML model on TCIT to extract further concepts for the final CIT. A sympathetic reader would care because EHR notes contain dense medical jargon that increases the chance of missing crucial clinical details, and automated highlighting can draw attention to important content. The final CIT is evaluated on a test set using coverage, breadth, completeness, and conciseness metrics, achieving 74.21 percent coverage, 1.68 breadth, 98.2 percent average completeness, and 84.2 percent average conciseness across 20 random notes.

Core claim

The paper claims that a three-phase ML technique produces a final CIT that highlights the test set with a coverage of 74.21 percent, a breadth of 1.68, an average completeness of 98.2 percent, and an average conciseness of 84.2 percent. The technique starts with an initial CIT composed of cardiology-related SNOMED sub-hierarchies, other SNOMED concepts mined from the EHRs of the build set, and components such as medical abbreviations and medications. It then iteratively extracts fine-grained phrases as CIT concept candidates and reviews them semi-automatically to yield the training data CIT (TCIT). Finally, an ML model trained with TCIT identifies additional candidates from the build set.

What carries the argument

The three-phase ML technique for CIT design, where phases one and two create the training data CIT (TCIT) through initial construction and candidate review, and phase three applies the ML model trained on TCIT to extract more concepts.
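To make the phase-two idea of growing candidate phrases around known concepts more concrete, here is a minimal Python sketch. It is not the paper's procedure: the expansion rule (absorb one adjacent non-stopword token per round), the toy stopword list, and the names `find_candidates`, `grow_terminology`, and `approve` are all assumptions for illustration. The `approve` callback stands in for the semi-automatic review.

```python
# Toy stopword list (assumption); the paper's actual list is not given here.
STOPWORDS = {"the", "a", "of", "with", "and", "was", "is", "in", "to", "has"}

def find_candidates(tokens, cit, max_len=4):
    """Return phrases made of a known concept plus one adjacent non-stopword token."""
    candidates = set()
    n = len(tokens)
    for length in range(1, max_len + 1):
        for i in range(n - length + 1):
            if " ".join(tokens[i:i + length]) not in cit:
                continue
            # Extend the known concept by one token to the left, if allowed.
            if i > 0 and tokens[i - 1] not in STOPWORDS:
                candidates.add(" ".join(tokens[i - 1:i + length]))
            # Extend the known concept by one token to the right, if allowed.
            if i + length < n and tokens[i + length] not in STOPWORDS:
                candidates.add(" ".join(tokens[i:i + length + 1]))
    return candidates - cit

def grow_terminology(notes, initial_cit, approve, max_rounds=5):
    """Iterate extraction and review until no new concept is approved."""
    cit = set(initial_cit)
    for _ in range(max_rounds):
        new = set()
        for note in notes:
            # Whitespace tokenization only; punctuation handling is omitted.
            tokens = note.lower().split()
            new |= {c for c in find_candidates(tokens, cit) if approve(c)}
        if not new:
            break
        cit |= new
    return cit
```

Starting from the single concept "stenosis" and a reviewer that accepts everything, the note "Patient has severe aortic stenosis" would yield "aortic stenosis" in the first round and "severe aortic stenosis" in the second. A stricter `approve` is where review quality would enter, which is the step the referee report below asks to see validated.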

If this is right

The CIT can highlight key details in cardiology EHR notes to reduce the likelihood of missing crucial information.
The method creates interface terminologies with reduced need for fully manual training data preparation.
The final CIT demonstrates high completeness while keeping conciseness at 84.2 percent on average.
The approach shows how SNOMED sub-hierarchies combined with EHR data can seed an expandable terminology.
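As an illustration of how a terminology can drive highlighting, the following sketch applies a greedy longest-match lookup over whitespace tokens. The matching strategy, the bracket rendering, and the function names `highlight` and `render` are assumptions; the summary above does not say how the paper's highlighter matches concepts.

```python
def highlight(tokens, cit, max_len=5):
    """Greedy longest-match: return (start, end) token spans found in the terminology."""
    spans = []
    i = 0
    while i < len(tokens):
        match_end = None
        # Try the longest window first so longer concepts win over their sub-phrases.
        for length in range(min(max_len, len(tokens) - i), 0, -1):
            if " ".join(tokens[i:i + length]).lower() in cit:
                match_end = i + length
                break
        if match_end is None:
            i += 1
        else:
            spans.append((i, match_end))
            i = match_end
    return spans

def render(tokens, spans):
    """Mark each highlighted span with square brackets."""
    out = list(tokens)
    for start, end in spans:
        out[start] = "[" + out[start]
        out[end - 1] = out[end - 1] + "]"
    return " ".join(out)

tokens = "Patient with severe aortic stenosis and atrial fibrillation on warfarin".split()
cit = {"stenosis", "aortic stenosis", "severe aortic stenosis",
       "atrial fibrillation", "warfarin"}
print(render(tokens, highlight(tokens, cit)))
# prints: Patient with [severe aortic stenosis] and [atrial fibrillation] on [warfarin]
```

Preferring the longest match is one design choice among several; it favors fine-grained phrases over the shorter concepts they contain.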

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The three-phase process could be tested on EHR data from multiple institutions to check if the metrics hold outside the original build and test sets.
Similar ML curation might apply to building interface terminologies in other clinical specialties like oncology or pediatrics.
Embedding the CIT directly into EHR display systems could change how clinicians scan notes in practice.

Load-bearing premise

The semi-automatic review of candidate concepts and the ML model trained on TCIT will identify additional concepts that are both relevant to cardiology and suitable for the interface terminology without introducing substantial noise or missing clinically critical terms.

What would settle it

A cardiologist review of the highlighted test set notes that finds the actual coverage, completeness, or conciseness metrics fall substantially below the reported values due to missed key details or excessive irrelevant highlights.
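A review of this kind could be scored along the following lines. This is a sketch under assumed definitions, since the text above does not define the metrics: completeness is taken as the share of reviewer-identified key details that were highlighted, and conciseness as the share of highlights the reviewer judged relevant. The function names and the set-based comparison are hypothetical.

```python
def completeness_conciseness(highlighted, key_details):
    """Score one note under the assumed definitions (both as percentages)."""
    highlighted, key_details = set(highlighted), set(key_details)
    hits = highlighted & key_details
    completeness = 100.0 * len(hits) / len(key_details) if key_details else 100.0
    conciseness = 100.0 * len(hits) / len(highlighted) if highlighted else 100.0
    return completeness, conciseness

def average_over_notes(per_note_scores):
    """Average (completeness, conciseness) pairs across reviewed notes."""
    n = len(per_note_scores)
    avg_completeness = sum(c for c, _ in per_note_scores) / n
    avg_conciseness = sum(s for _, s in per_note_scores) / n
    return avg_completeness, avg_conciseness
```

Under these definitions, a note with three key details, all highlighted, plus one irrelevant highlight would score 100 percent completeness and 75 percent conciseness.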

Figures

Figures reproduced from arXiv: 2606.08311 by Andrew J. Einstein, Fadi P. Deek, Gai Elhanan, James Geller, Luke Lindemann, Mahshad Koohi Habibi Dehkordi, Shuxin Zhou, Vipina K. Keloth, Yehoshua Perl.

**Figure 4.** EHR note highlighted by LLMs.

Original abstract

Electronic health record (EHR) notes are dense medical documents containing large amounts of information, often filled with complex medical jargon. Highlighting all details in EHRs helps reduce the likelihood of missing crucial information by drawing attention to key content. This study proposes the design of a Cardiology Interface Terminology (CIT) to accurately highlight all details in EHR notes of cardiology patients. We introduce an innovative Machine Learning (ML) technique for the design of CIT. The ML technique requires training data. Manual preparation of such training data is time-consuming and expensive. The process of the CIT design includes three phases. In the first two phases, we innovatively derive a training data CIT to be used by the third phase, ML technique. We start by designing an initial CIT, composed of several components: the cardiology-related sub-hierarchies of SNOMED, other SNOMED concepts mined from EHRs of build set, and necessary components of terms e.g., medical abbreviations and medications. Utilizing an iterative process, fine-grained phrases containing initial CIT concepts are extracted from build set as CIT concept candidates. The candidate concepts are semi-automatically reviewed before being added to CIT, yielding the training data CIT, TCIT. In the third phase, a ML model is trained with TCIT to identify candidates fitting to be concepts in the CIT. This model is used to extract further concepts from build set, yielding the final CIT. The final CIT is then used to highlight the test set and evaluate the extent to which it captures details in an unseen EHR dataset. For this purpose, four evaluation metrics, coverage, breadth, completeness, and conciseness are used. The highlighted test set has a coverage of 74.21%, with a breadth of 1.68. For 20 random notes in test set, the average completeness is 98.2% and average conciseness is 84.2%.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper gives a concrete three-phase pipeline for building a cardiology terminology from SNOMED and EHRs with reported test metrics, but supplies no validation numbers for the review or ML steps.


The main takeaway is that the authors built a Cardiology Interface Terminology through a three-phase ML-assisted process based on SNOMED and EHR data, reporting 74% coverage and strong completeness on test notes, but without any validation metrics for the curation steps themselves.

They lay out the process in detail. Phase one pulls cardiology sub-hierarchies from SNOMED plus mined concepts and abbreviations. Phase two iteratively extracts candidate phrases from the build set and does semi-automatic review to create the training CIT. Phase three trains an ML model on that to extract more concepts from the build set for the final CIT. They then apply it to highlight a test set and measure coverage, breadth, completeness, and conciseness.

This is new in the sense that it combines these steps into a specific pipeline for cardiology highlighting, and they provide actual performance numbers on held-out data. The use of SNOMED as a foundation is solid, and the iterative approach makes sense for expanding coverage without full manual work.

The soft spots are clear from the abstract. There is no information on the ML model type, its features, or its accuracy on any held-out data. There is no inter-rater agreement or precision figure for the semi-automatic review. The stress-test concern is on point: the high completeness and conciseness could come from the same selection process rather than proving the CIT is clinically sound. If the review step missed critical terms or added irrelevant ones, the ML model would just amplify that.

This kind of paper is for people building practical tools for EHR review in specific medical domains. It could help someone trying to create similar terminologies.

I think it deserves peer review. The evaluation is there and the process is described, but reviewers will need to see full methods to judge if the numbers hold up.

Referee Report

3 major / 1 minor

Summary. The manuscript proposes a three-phase process to curate a Cardiology Interface Terminology (CIT) for highlighting details in EHR notes: phase 1 constructs an initial CIT from SNOMED cardiology sub-hierarchies plus mined concepts and abbreviations; phase 2 iteratively extracts and semi-automatically reviews candidate phrases from a build-set EHR corpus to produce a training CIT (TCIT); phase 3 trains an ML model on TCIT to extract further concepts from the build set, yielding the final CIT. The final CIT is applied to a held-out test set, reporting coverage of 74.21%, breadth of 1.68, and (on 20 random notes) average completeness of 98.2% and conciseness of 84.2%.

Significance. If the curation process is shown to be reliable, the work would demonstrate a practical semi-automated pipeline that combines ontology mining, human review, and ML to produce interface terminologies for clinical highlighting tasks, potentially lowering the cost of manual terminology development while achieving high coverage on unseen cardiology notes. The explicit use of a separate test set is a methodological strength that supports claims of applicability beyond the build data.

major comments (3)

[Abstract] Abstract (phase 3 description): the ML model is characterized only as 'a ML model is trained with TCIT to identify candidates'; no architecture, feature set, training algorithm, hyperparameters, or held-out performance numbers for the extractor itself are supplied. This is load-bearing for the central claim, because the reported test-set metrics presuppose that phase-3 extraction adds relevant concepts without substantial noise or omissions.
[Abstract] Abstract (phase 2 description): the semi-automatic review of mined candidates is stated to 'yield the training data CIT, TCIT' with no accompanying inter-rater reliability, precision, or error-rate statistics. This is load-bearing because any systematic bias or incompleteness introduced here propagates directly into the ML training data and therefore into the final CIT whose quality is asserted by the test-set completeness (98.2%) and conciseness (84.2%) figures.
[Abstract] Abstract (evaluation paragraph): the four metrics are invoked without formal definitions or computation procedures (e.g., how 'breadth of 1.68' or 'average completeness' are calculated from the highlighted notes). This prevents independent verification that the numbers support the claim that the CIT 'accurately highlight[s] all details'.
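For orientation, one plausible reading of the first two metrics is sketched below. The definitions are assumptions, not taken from the manuscript: coverage is read as the percentage of note words that fall inside a highlight, and breadth as the average number of words per highlighted concept. The function name and the `(start, end)` span format are hypothetical.

```python
def coverage_and_breadth(notes):
    """Compute corpus-level coverage (%) and breadth under the assumed definitions.

    `notes` is a list of (num_tokens, spans) pairs, where spans are
    non-overlapping (start, end) token index pairs with end exclusive.
    """
    total_tokens = sum(num_tokens for num_tokens, _ in notes)
    highlighted_tokens = sum(end - start for _, spans in notes for start, end in spans)
    num_concepts = sum(len(spans) for _, spans in notes)
    coverage = 100.0 * highlighted_tokens / total_tokens if total_tokens else 0.0
    breadth = highlighted_tokens / num_concepts if num_concepts else 0.0
    return coverage, breadth
```

With a 10-word note and spans of 3, 2, and 1 words, this reading gives 60 percent coverage and a breadth of 2.0. Whether the reported 74.21 percent and 1.68 follow the same arithmetic is exactly what the comment above asks the authors to state.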

minor comments (1)

[Abstract] The parenthetical examples of 'necessary components of terms e.g., medical abbreviations and medications' would benefit from an explicit enumeration or reference to a supplementary table listing the exact additional term classes included.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback on our manuscript. We address each major comment below with clarifications from the full text and indicate where revisions will be made to strengthen the presentation.


Referee: [Abstract] Abstract (phase 3 description): the ML model is characterized only as 'a ML model is trained with TCIT to identify candidates'; no architecture, feature set, training algorithm, hyperparameters, or held-out performance numbers for the extractor itself are supplied. This is load-bearing for the central claim, because the reported test-set metrics presuppose that phase-3 extraction adds relevant concepts without substantial noise or omissions.

Authors: We agree the abstract is too high-level on phase 3. The full manuscript (Methods, Section 3.3) specifies the ML model as a supervised sequence labeling approach using features derived from TCIT concepts, trained via standard algorithms with cross-validation on the build set, and reports held-out performance metrics for the extractor. We will revise the abstract to include a concise summary of the architecture, key features, and extractor performance to directly support the test-set claims. revision: yes
Referee: [Abstract] Abstract (phase 2 description): the semi-automatic review of mined candidates is stated to 'yield the training data CIT, TCIT' with no accompanying inter-rater reliability, precision, or error-rate statistics. This is load-bearing because any systematic bias or incompleteness introduced here propagates directly into the ML training data and therefore into the final CIT whose quality is asserted by the test-set completeness (98.2%) and conciseness (84.2%) figures.

Authors: The full manuscript (Methods, Section 3.2) describes the semi-automatic review criteria and process in detail. However, we did not collect or report inter-rater reliability or precision statistics for this phase. We will add a note to the revised abstract and methods acknowledging this limitation and its potential impact, while noting that the high test-set metrics provide supporting evidence of overall quality. Direct statistics cannot be added retroactively without new annotation effort. revision: partial
Referee: [Abstract] Abstract (evaluation paragraph): the four metrics are invoked without formal definitions or computation procedures (e.g., how 'breadth of 1.68' or 'average completeness' are calculated from the highlighted notes). This prevents independent verification that the numbers support the claim that the CIT 'accurately highlight[s] all details'.

Authors: We agree that explicit definitions are required for reproducibility. The full manuscript (Methods, Section 4) provides formal definitions and exact computation procedures for coverage, breadth, completeness, and conciseness, including how they are derived from the highlighted notes. We will revise the abstract to include brief definitions or a direct reference to the Methods section for these metrics. revision: yes
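The first response mentions cross-validation on the build set. As a generic picture of what that involves, here is a small k-fold sketch in plain Python. The fold count, the accuracy scoring, and the `fit` and `predict` callbacks are assumptions; nothing here reflects the model or settings actually used in the manuscript.

```python
import random

def k_fold_indices(n_items, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k disjoint held-out folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Deal shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        test_idx = folds[held_out]
        train_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train_idx, test_idx

def cross_validate(examples, labels, fit, predict, k=5):
    """Return mean held-out accuracy; assumes len(examples) >= k."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(examples), k):
        model = fit([examples[i] for i in train_idx], [labels[i] for i in train_idx])
        correct = sum(predict(model, examples[i]) == labels[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)
```

Here `examples` would be candidate phrases and `labels` the accept or reject decisions recorded in TCIT. Note that this measures agreement with the review decisions, so it cannot by itself address the second comment about the reliability of the review.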

Circularity Check

0 steps flagged

No circularity; evaluation metrics computed on held-out test set independent of construction process.


The paper constructs the CIT via a three-phase process that begins with external SNOMED sub-hierarchies and concepts mined from a build set, followed by semi-automatic review to produce TCIT, ML training on TCIT, and further extraction from the build set. The final CIT is then applied to a separate test set to compute coverage (74.21%), breadth (1.68), completeness (98.2%), and conciseness (84.2%). No equations, self-definitions, or self-citations reduce these metrics to quantities defined by the same fitted parameters or inputs used in construction. The derivation chain is self-contained against external benchmarks (SNOMED, unseen EHR notes) with no load-bearing step that collapses by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the assumption that SNOMED provides a sufficient starting ontology for cardiology concepts, that EHR notes contain extractable fine-grained phrases suitable for terminology expansion, and that human review plus ML can reliably distinguish suitable interface terms. No free parameters are explicitly fitted in the abstract. No new physical or mathematical entities are postulated.

axioms (2)

domain assumption: SNOMED CT sub-hierarchies contain the core cardiology concepts needed for an interface terminology
Invoked in phase 1 when the initial CIT is composed of cardiology-related SNOMED sub-hierarchies.
domain assumption: Fine-grained phrases extracted from EHR notes can be reviewed and classified as valid CIT concepts
Central to the iterative process in phases 1-2 that produces TCIT.
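The first axiom presumes that a sub-hierarchy can be pulled out of SNOMED CT as a seed. A minimal sketch of that step is a breadth-first walk over is-a relationships from a chosen root. The `children` mapping and the concept names below are toy, hypothetical data, not real SNOMED CT content, and the function name is an assumption.

```python
from collections import deque

def sub_hierarchy(root, children):
    """Collect the root and all of its descendants via is-a links."""
    seen = {root}
    queue = deque([root])
    while queue:
        concept = queue.popleft()
        for child in children.get(concept, ()):
            # A concept may have several parents, so skip ones already visited.
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen

# Toy is-a structure (hypothetical names), mapping parent -> list of children.
children = {
    "heart disease": ["valve disorder", "arrhythmia"],
    "valve disorder": ["aortic stenosis"],
    "arrhythmia": ["atrial fibrillation"],
    "stenosis": ["aortic stenosis"],
    "lung disease": ["asthma"],
}
cardiology_seed = sub_hierarchy("heart disease", children)
```

In this toy data the seed contains five concepts and excludes "asthma". Which roots to start from is a curation decision, and that decision is where the axiom could fail.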





work page doi:10.2196/66476 2025

[14] [14]

Perl, F.P

Koohi Habibi Dehkordi, M., Y. Perl, F.P. Deek, and H. Liu, Fine-Tuning LLaMA2 for Summarizing Discharge Notes: Evaluating the Role of Highlighted Information. Big Data and Cognitive Computing, 2025. 10(1): p. 4

2025

[15] [15]

Dashboard., F.A.E.R.S.F.P.; Available from: https://fis.fda.gov/sense/app/95239e26-e0be-42d9- a960-9a5f7f1c25ee/sheet/45beeb74-30ab-46be-8267-5756582633b4/state/analysis

[16] [16]

Stud Health Technol Inform, 2006

Donnelly, K., SNOMED-CT: The advanced terminology and coding system for eHealth. Stud Health Technol Inform, 2006. 121: p. 279-90

2006

[17] [17]

Miller, K.B

Rosenbloom, S.T., R.A. Miller, K.B. Johnson, P.L. Elkin, and S.H. Brown, Interface terminologies: facilitating direct entry of clinical data into electronic health record systems. Journal of the American medical informatics association, 2006. 13(3): p. 277-288

2006

[18] [18]

Sheikh, and B

Duncker, E., J.A. Sheikh, and B. Fields. From global terminology to local terminology: A review on cross-cultural interface design solutions. in Cross-Cultural Design. Methods, Practice, and Case Studies: 5th International Conference, CCD 2013, Held as Part of HCI International 2013, Las Vegas, NV, USA, July 21-26, 2013, Proceedings, Part I 5. 2013. Springer

2013

[19] [19]

Patel, and A.W

Cimino, J.J., V.L. Patel, and A.W. Kushniruk, Studying the human—computer—terminology interface. Journal of the American Medical Informatics Association, 2001. 8(2): p. 163-173

2001

[20] [20]

Ahmadian, R

Bakhshi-Raiez, F., L. Ahmadian, R. Cornet, E. de Jonge, and N. De Keizer, Construction of an Interface Terminology on SNOMED CT. Methods of Information in Medicine, 2010. 49(04): p. 349-359

2010

[21] [21]

Einstein, S

Dehkordi, M.K.H., A.J. Einstein, S. Zhou, G. Elhanan, Y. Perl, V.K. Keloth, J. Geller, and H. Liu, Using annotation for computerized support for fast skimming of cardiology electronic health record notes, in 2023 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). 2023, IEEE. p. 4043-4050

2023

[22] [22]

Kollapally, Y

Dehkordi, M.K.H., N.M. Kollapally, Y. Perl, J. Geller, F.P. Deek, H. Liu, V.K. Keloth, G. Elhanan, and A.J. Einstein. Skimming of Electronic Health Records Highlighted by an Interface Terminology Curated with Machine Learning Mining. in BIOSTEC (2). 2024

2024

[23] [23]

Pollard, L

Johnson, A.E., T.J. Pollard, L. Shen, L.-w.H. Lehman, M. Feng, M. Ghassemi, B. Moody, P. Szolovits, L. Anthony Celi, and R.G. Mark, MIMIC-III, a freely accessible critical care database. Scientific data, 2016. 3(1): p. 1-9

2016

[24] [24]

NLTK's list of english stopwords

github. NLTK's list of english stopwords. 2010; Available from: https://gist.github.com/sebleier/554280

2010

[25] [25]

Common Medical Abbreviations

Association, A.S.-L.-H. Common Medical Abbreviations. Available from: https://www.asha.org/practice-portal/professional-issues/documentation-in-health-care/common- medical-abbreviations/. 29

[26] [26]

Cardiology Abbreviations and Diagnosis

Utah, U.o. Cardiology Abbreviations and Diagnosis. u.d; Available from: http://www.ped.med.utah.edu/pedsintranet/outpatient/triage/team_red/cardio_abbreviations_diagn osis.pdf

[27] [27]

List of medical abbreviations

Wikipedia. List of medical abbreviations. 2015; Available from: https://en.wikipedia.org/wiki/List_of_medical_abbreviations

2015

[28] [28]

github. Negex. Available from: https://github.com/chapmanbe/negex/blob/master/genConText/trigger-neg.txt

[29] [29]

Heart Medications

Heart.org. Heart Medications. u.d; Available from: https://www.heart.org/en/health-topics/heart- attack/treatment-of-a-heart-attack/cardiac-medications

[30] [30]

1000 English Verbs Forms

worldclasslearning. 1000 English Verbs Forms. Available from: https://www.worldclasslearning.com/english/five-verb-forms.html#google_vignette

[31] [31]

Perkins, D

Hardeniya, N., J. Perkins, D. Chopra, N. Joshi, and I. Mathur, Natural language processing: python and NLTK. 2016: Packt Publishing Ltd

2016

[32] [32]

2008, Oxford University Press

Daintith, J., Kleene star, in A Dictionary of Computing. 2008, Oxford University Press

2008

[33] [33]

Almeida, F. and G. Xexéo, Word embeddings: A survey. arXiv preprint arXiv:1901.09069, 2019

work page arXiv 1901

[34] [34]

Publicly Available Clinical BERT Embeddings

Alsentzer, E., J.R. Murphy, W. Boag, W.-H. Weng, D. Jin, T. Naumann, and M. McDermott, Publicly available clinical BERT embeddings. arXiv preprint arXiv:1904.03323, 2019

work page internal anchor Pith review Pith/arXiv arXiv 1904

[35] [35]

Alimova, I. and E. Tutubalina, Multiple features for clinical relation extraction: A machine learning approach. Journal of biomedical informatics, 2020. 103: p. 103382

2020

[36] [36]

Kharde, and A.D

Dongare, A., R. Kharde, and A.D. Kachare, Introduction to artificial neural network. International Journal of Engineering and Innovative Technology (IJEIT), 2012. 2(1): p. 189-194

2012

[37] [37]

Valentin, and B

Abdi, H., D. Valentin, and B. Edelman, Neural networks. 1999: Sage

1999

[38] [38]

Liashchynskyi, P. and P. Liashchynskyi, Grid search, random search, genetic algorithm: a big comparison for NAS. arXiv preprint arXiv:1912.06059, 2019

work page arXiv 1912

[39] [39]

Deep Learning using Rectified Linear Units (ReLU)

Agarap, A.F., Deep learning using rectified linear units (relu). arXiv preprint arXiv:1803.08375, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018

[40] [40]

Ismail, and S.Q

Jais, I.K.M., A.R. Ismail, and S.Q. Nisa, Adam optimization algorithm for wide and deep neural network. Knowledge Engineering and Data Science, 2019. 2(1): p. 41-46

2019

[41] [41]

Statistics and Computing, 2011

Fushiki, T., Estimation of prediction error by using K-fold cross-validation. Statistics and Computing, 2011. 21: p. 137-146

2011

[42] [42]

Hinton, A

Srivastava, N., G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, Dropout: a simple way to prevent neural networks from overfitting. The journal of machine learning research, 2014. 15(1): p. 1929-1958

2014

[43] [43]

ArXiv, 2004

Bird, S., NLTK: The Natural Language Toolkit. ArXiv, 2004. cs.CL/0205028

work page arXiv 2004

[44] [44]

Dehkordi, Y

Kollapally, N.M., M.K.H. Dehkordi, Y. Perl, J. Geller, F.P. Deek, H. Liu, V.K. Keloth, G. Elhanan, A.J. Einstein, and S. Zhou. Using clinical entity recognition for curating an interface terminology to aid fast skimming of EHRs. in 2024 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). 2024. IEEE

2024

[45] [45]

White, J., Q. Fu, S. Hays, M. Sandborn, C. Olea, H. Gilbert, A. Elnashar, J. Spencer-Smith, and D.C. Schmidt, A prompt pattern catalog to enhance prompt engineering with chatgpt. arXiv preprint arXiv:2302.11382, 2023

work page internal anchor Pith review Pith/arXiv arXiv 2023

[46] [46]

Annals of biomedical engineering, 2023

Giray, L., Prompt engineering with ChatGPT: a guide for academic writers. Annals of biomedical engineering, 2023. 51(12): p. 2629-2633

2023

[47] [47]

Haltaufderheide, J. and R. Ranisch, The ethics of ChatGPT in medicine and healthcare: a systematic review on Large Language Models (LLMs). NPJ digital medicine, 2024. 7(1): p. 183

2024

[48] [48]

Qureshi, A

Hadi, M.U., R. Qureshi, A. Shah, M. Irfan, A. Zafar, M.B. Shaikh, N. Akhtar, J. Wu, and S. Mirjalili, A survey on large language models: Applications, challenges, limitations, and practical usage. Authorea Preprints, 2023

2023

[49] [49]

Karabacak, M. and K. Margetis, Embracing large language models for medical applications: opportunities and challenges. Cureus, 2023. 15(5). 30

2023

[50] [50]

Ruzzetti, A

Miranda, M., E.S. Ruzzetti, A. Santilli, F.M. Zanzotto, S. Bratières, and E. Rodolà, Preserving privacy in large language models: A survey on current threats and solutions. arXiv preprint arXiv:2408.05212, 2024

work page arXiv 2024

[51] [51]

Yao, Y., J. Duan, K. Xu, Y. Cai, Z. Sun, and Y. Zhang, A survey on large language model (llm) security and privacy: The good, the bad, and the ugly. High-Confidence Computing, 2024: p. 100211

2024

[52] [52]

Amini, and Y

Das, B.C., M.H. Amini, and Y. Wu, Security and privacy challenges of large language models: A survey. ACM Computing Surveys, 2024

2024

[53] [53]

Geißler, and P

Zhou, B., D. Geißler, and P. Lukowicz, Misinforming LLMs: vulnerabilities, challenges and opportunities. arXiv preprint arXiv:2408.01168, 2024

work page arXiv 2024

[54] [54]

Drobnjak, and I

Perković, G., A. Drobnjak, and I. Botički. Hallucinations in llms: Understanding and addressing challenges. in 2024 47th MIPRO ICT and Electronics Convention (MIPRO). 2024. IEEE

2024

[55] [55]

Dehkordi, M.K.H., J. Lu, Y. Perl, and F.P. Deek, Enhancing Patient Comprehension of Discharge Notes with a Retrieval-Augmented LLM Approach, in 2025 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). 2025, IEEE

2025

[56] [56]

Gupta, and S.N

Ranjan, R., S. Gupta, and S.N. Singh, A comprehensive survey of bias in llms: Current landscape and future directions. arXiv preprint arXiv:2409.16430, 2024

work page arXiv 2024

[57] [57]

Guo, Y., M. Guo, J. Su, Z. Yang, M. Zhu, H. Li, M. Qiu, and S.S. Liu, Bias in large language models: Origin, evaluation, and mitigation. arXiv preprint arXiv:2411.10915, 2024

work page internal anchor Pith review Pith/arXiv arXiv 2024

[58] [58]

Chang, W

Ong, J.C.L., S.Y.-H. Chang, W. William, A.J. Butte, N.H. Shah, L.S.T. Chew, N. Liu, F. Doshi- Velez, W. Lu, and J. Savulescu, Ethical and regulatory challenges of large language models in medicine. The Lancet Digital Health, 2024. 6(6): p. e428-e432

2024

[59] [59]

Bedi, S., Y. Liu, L. Orr-Ewing, D. Dash, S. Koyejo, A. Callahan, J.A. Fries, M. Wornow, A. Swaminathan, and L.S. Lehmann, A Systematic Review of Testing and Evaluation of Healthcare Applications of Large Language Models (LLMs). medRxiv, 2024: p. 2024.04. 15.24305869

2024

[60] [60]

Khoshgoftaar, and D

Weiss, K., T.M. Khoshgoftaar, and D. Wang, A survey of transfer learning. Journal of Big data,

[61] [61]

Yan, and Z

Peng, Y., S. Yan, and Z. Lu. Transfer Learning in Biomedical Natural Language Processing: An Evaluation of BERT and ELMo on Ten Benchmarking Datasets. in BioNLP@ACL. 2019

2019

[62] [62]

Sun, C. and Z. Yang. Transfer Learning in Biomedical Named Entity Recognition: An Evaluation of BERT in the PharmaCoNER task. in Conference on Empirical Methods in Natural Language Processing. 2019

2019

[63] [63]

Giorgi, J. and G.D. Bader, Transfer learning for biomedical named entity recognition with neural networks. Bioinformatics, 2018. 34: p. 4087 - 4094

2018

[64] [64]

Perl, and F.P

Dehkordi, M.K.H., Y. Perl, and F.P. Deek, Optimizing Manual Review Using Machine Learning in Interface Terminology Curation for Automatic EHR Highlighting, in 2025 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). 2025, IEEE

2025