Recent Advances in Generative AI for Healthcare Applications

Jose Colmenarez; Linxia Gu; Matthew M. Nikahd; Pengfei Dong; Sahar Yarmohammadtoosky; Wenxi Liu; Xianqi Li; Yasin Shokrollahi

arXiv: 2310.00795 · v2 · submitted 2023-10-01 · cs.LG · cs.AI

Yasin Shokrollahi, Jose Colmenarez, Wenxi Liu, Sahar Yarmohammadtoosky, Matthew M. Nikahd, Pengfei Dong, Xianqi Li, Linxia Gu

Pith reviewed 2026-05-24 06:24 UTC · model grok-4.3

keywords: generative AI, healthcare, diffusion models, transformer architectures, medical imaging, drug design, clinical decision support

The pith

Generative AI led by diffusion models and transformers has enabled breakthroughs in medical imaging, protein prediction, and clinical tasks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This review synthesizes recent applications of generative AI in healthcare. It focuses on diffusion models and transformer architectures and their role in medical image tasks, protein structure work, and clinical support functions. The synthesis covers progress in diagnosis assistance, documentation, coding, and drug-related molecular tasks. A reader would care because these applications could change how medical data is processed and decisions are made. The paper also notes current limits and suggests paths for further work.

Core claim

Generative AI, led by diffusion models and transformer architectures, has enabled significant breakthroughs in medical imaging (including image reconstruction, image-to-image translation, generation, and classification), protein structure prediction, clinical documentation, diagnostic assistance, radiology interpretation, clinical decision support, medical coding, and billing, as well as drug design and molecular representation. These innovations have enhanced clinical diagnosis, data reconstruction, and drug synthesis.

What carries the argument

Diffusion models and transformer architectures applied across medical imaging, protein prediction, and clinical workflow tasks.

If this is right

Medical imaging workflows gain new tools for reconstruction, translation, and classification.
Protein structure prediction benefits from generative approaches that improve accuracy.
Clinical documentation, coding, and decision support see efficiency gains.
Drug design and molecular representation tasks become more automated and targeted.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Wider use in hospitals could shift training requirements for medical staff toward AI oversight skills.
Privacy rules around patient data may limit how broadly these models can be trained in practice.
The same model families might extend to non-imaging domains such as electronic health record forecasting.
Validation studies focused on real-world deployment outcomes would be a natural next measurement step.

Load-bearing premise

The review assumes that the body of cited literature provides a representative and unbiased sample of the field without systematic omission of negative results or over-representation of positive ones.

What would settle it

A meta-analysis that identifies many high-quality studies showing no measurable gains from these models in the listed healthcare areas would falsify the claim of significant breakthroughs.

Figures

Figures reproduced from arXiv: 2310.00795 by Jose Colmenarez, Linxia Gu, Matthew M. Nikahd, Pengfei Dong, Sahar Yarmohammadtoosky, Wenxi Liu, Xianqi Li, Yasin Shokrollahi.

**Figure 2.** Timeline of Generative AI Family Types. Since diffusion and transformer-based models have outperformed other types of models/architectures and are increasingly used in the healthcare industry (…

**Figure 3.** Generative AI in healthcare. 1.1. Rationale and Distinctiveness of Our Review: In healthcare, generative AI has made significant progress over the years. Many experts have written detailed reviews about deep generative AI models designed specifically for healthcare purposes. These reviews include the studies by (Bohr and Memarzadeh 2020), (AlAmir and AlGhamdi 2022), (Ali et al. 2022), (Kazerouni et al. 202…

**Figure 4.** (left) We observe the representation of self-attention; (right) the illustration delineates the configuration of Multi-Head Attention, which comprises multiple concurrent attention layers (image by (Vaswani et al. 2017)). The process under discussion involves an encoder utilizing a mechanism known as multi-head attention, which extends beyond the singular contextual comprehension found in self-attention. S…

**Figure 5.** The proposed taxonomy for diffusion-based models in health care in six sub-fields, (I) Image Reconstruction 1. (Özbey et al. 2023), 2. (Xie and Li 2022), (Yang et al. 2024), (II) Image to Image Translation 3. (Lyu and Wang 2022), 4. (Özbey et al. 2023), (III) Image Generation 5. (Müller-Franzes et al. 2023), 6. (Pan et al. 2023), (Sun et al. 2024), (IV) Image Classification 7. (H.-J. Oh and Jeong 2023), 8.…

**Figure 6.** Results are shown for (a) T1-weighted acquisitions in the IXI dataset and (b) FLAIR-weighted acquisitions in the fastMRI dataset. Reconstructed images are given along with the reference image derived from fully sampled acquisitions, and zoom-in windows and arrows are included to highlight differences among methods. LORAKS and GAN prior show high noise amplification, rGAN shows residual aliasing, and MoDL s…

**Figure 7.** SynDiff showcased its capability for MRI contrast conversions on the BRATS dataset (Menze et al. 2014). For illustrative purposes, source images, synthesized outputs, and the actual reference images are presented for the following tasks: a) T1 to T2 and b) T2 to FLAIR (Fluid Attenuation Inversion Recovery). The visualization scales utilized are a) [0, 0.75] and b) [0, 0.80]. SynDiff effectively minimizes n…

**Figure 8.** MT-DDPM Framework's Diffusion Methodology (a): Medical imagery is progressively transformed into pure Gaussian noise through incremental noise addition in the forward diffusion. For the backward process, a system is tasked to filter the Gaussian noise back into a pristine image continuously. (b): MT-DDPM Network Design: This network uses a balanced encoder-decoder design to master the backward process. The…

**Figure 9.** An overview of the DiffMIC framework includes (a) The forward process during the training phase (b) The reverse process for inference. (c) The DCG Model 𝜏𝐷 directs the diffusion using dual priors from the raw image and ROIs (Y. Yang et al. 2023).

**Figure 10.** The proposed taxonomy for Transformer-based models in health care in seven sub-fields, (I) protein structure prediction 1. (Vig et al. 2020), 2. (Behjati et al. 2022), 3. (Abdine et al. 2023), 4. (Boadu, Cao, and Cheng 2023), 5. (Y. Cao and Shen 2021), 6. (Geffen, Ofran, and Unger 2022), 7. (Castro et al…

**Figure 11.** Structure of the Deep-Learning Line Classification Model. Textual and layout characteristics of each line are embedded to generate a unique representation for each line, which is subsequently contextualized using a four-layer Transformer that employs self-attention with relative position information. Finally, the representations are classified using a linear layer and a SoftMax function to determine the p…

**Figure 12.** Structure of the Deep-Learning Line Classification Model. Textual and layout characteristics of each line are embedded to generate a unique representation for each line, which is subsequently contextualized using a four-layer Transformer that employs self-attention with relative position information. Finally, the representations are classified using a linear layer and a SoftMax function to determine the p…

**Figure 13.** Schematic of the Proposed Methodology. The visual search patterns of radiologists on chest radiographs serve as the initial input for training a global-focal teacher network, denoted as Human Visual Attention Training. Subsequently, this pre-trained teacher network instructs the global-focal student network to acquire visual attention utilizing a newly devised Visual Attention Loss. Implementing the stude…

**Figure 15.** The proposed architecture of the Prot2Text framework is designed for predicting protein function descriptions in the free-form text (picture by (Abdine et al. 2023)). Recent advancements in protein structure prediction and function annotation have been driven by innovative techniques such as TransFun (Boadu, Cao, and Cheng 2023), TALE (Y. Cao and Shen 2021), DistilProtBert (Geffen, Ofran, and Unger 2022),…

**Figure 16.** Pipeline for training and generation using the MolGPT model (Bagal et al. 2022). (K. Huang et al. 2021) proposed MolTrans, an approach that combines a knowledge-inspired substructural pattern mining algorithm, an interaction modeling module, and an enhanced transformer encoder for more precise and interpretable drug-target interaction (DTI) predictions. This approach outperformed leading-edge baseline mo…
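The Figure 8 caption describes forward diffusion as incremental noise addition that ends in pure Gaussian noise. A minimal sketch of that idea follows, using the standard DDPM closed form for sampling a noised image at step t directly from the clean image. The linear schedule, its endpoints, the 1000-step horizon, and the names `linear_beta_schedule` and `forward_diffuse` are illustrative assumptions; they are not taken from the MT-DDPM paper or from the review.

```python
import numpy as np

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances. The linear shape and endpoints are assumed."""
    return np.linspace(beta_start, beta_end, num_steps)

def forward_diffuse(x0, t, betas, rng):
    """Sample x_t ~ N(sqrt(abar_t) * x0, (1 - abar_t) * I) in one shot."""
    alpha_bar = np.cumprod(1.0 - betas)        # fraction of signal kept up to each step
    noise = rng.standard_normal(x0.shape)      # epsilon ~ N(0, I)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise
    return x_t, noise

rng = np.random.default_rng(0)
image = rng.uniform(-1.0, 1.0, size=(8, 8))    # stand-in for a normalized scan (hypothetical data)
betas = linear_beta_schedule(1000)
alpha_bar = np.cumprod(1.0 - betas)

x_early, _ = forward_diffuse(image, 10, betas, rng)    # still close to the input
x_late, _ = forward_diffuse(image, 999, betas, rng)    # almost entirely noise
```

Under these assumptions, early steps keep nearly all of the signal, while the last step keeps almost none. That is what allows a learned reverse process to start from Gaussian noise.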

Original abstract

The rapid advancement of Artificial Intelligence (AI) has catalyzed revolutionary changes across various sectors, notably in healthcare. In particular, generative AI, led by diffusion models and transformer architectures, has enabled significant breakthroughs in medical imaging (including image reconstruction, image-to-image translation, generation, and classification), protein structure prediction, clinical documentation, diagnostic assistance, radiology interpretation, clinical decision support, medical coding, and billing, as well as drug design and molecular representation. These innovations have enhanced clinical diagnosis, data reconstruction, and drug synthesis. This review paper aims to offer a comprehensive synthesis of recent advances in healthcare applications of generative AI, with an emphasis on diffusion and transformer models. Moreover, we discuss current capabilities, highlight existing limitations, and outline promising research directions to address emerging challenges. Serving as both a reference for researchers and a guide for practitioners, this work offers an integrated view of the state of the art, its impact on healthcare, and its future potential.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This is a review paper that organizes existing generative AI work in healthcare but provides no new results and no explicit method for selecting or balancing the cited studies.

The main thing to know is that this is a survey paper with no original experiments, derivations, or techniques. It compiles published work on diffusion models and transformers applied to medical imaging, protein prediction, clinical notes, diagnostics, drug design, and a few other areas, then lists limitations and future directions. That synthesis is the only contribution.

It does a reasonable job laying out the application categories in one place, which could save a newcomer some time hunting through the literature. The abstract flags that the authors want to highlight capabilities, limitations, and open problems, and the structure appears to follow those headings.

The central claim that these models have produced significant breakthroughs across nine domains rests entirely on the selection and interpretation of prior papers. The stress-test note is on point here: there is no mention in the provided abstract of a search protocol, inclusion criteria, or handling of negative results. If the full text also skips that, the narrative of widespread enabled breakthroughs cannot be checked for selection bias. That is the main soft spot, and it is not minor for a review whose value depends on representativeness. No new data or formal verification is offered to strengthen the synthesis.

This paper is for readers who want a quick map of the generative AI healthcare landscape rather than a deep technical dive or a critical evaluation. It is not strong enough on its own to change how anyone thinks about the field. I would bring it to a reading group only if the group is specifically looking for recent surveys to discuss coverage gaps. I would not cite it in my own work. A serious editor could send it to peer review so referees can check whether the reference list is balanced and whether the limitations section is accurate, but the paper would need a methods section added first.

Referee Report

1 major / 0 minor

Summary. This review paper claims that generative AI, led by diffusion models and transformer architectures, has enabled significant breakthroughs across nine healthcare domains: medical imaging (reconstruction, translation, generation, classification), protein structure prediction, clinical documentation, diagnostic assistance, radiology interpretation, clinical decision support, medical coding and billing, and drug design/molecular representation. It positions itself as a comprehensive synthesis of recent advances, with discussion of current capabilities, limitations, and future research directions, serving as a reference for researchers and guide for practitioners.

Significance. A well-executed survey with transparent selection criteria could provide a useful integrated view of the state of the art in an active area. However, the headline narrative of widespread 'significant breakthroughs' rests entirely on the representativeness and balance of the cited literature; without that, the synthesis does not add substantial new insight beyond existing individual papers.

major comments (1)

[Abstract / Introduction] The central claim that diffusion and transformer models have produced 'significant breakthroughs' in the nine listed domains is supported solely by selection and interpretation of prior publications. No explicit literature-search protocol, inclusion/exclusion criteria, date range, or handling of negative/null results is described, making it impossible to evaluate whether the cited works constitute a representative sample or over-represent positive demonstrations.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback. We address the single major comment below and will revise the manuscript to improve transparency on literature selection.

Referee: [Abstract / Introduction] The central claim that diffusion and transformer models have produced 'significant breakthroughs' in the nine listed domains is supported solely by selection and interpretation of prior publications. No explicit literature-search protocol, inclusion/exclusion criteria, date range, or handling of negative/null results is described, making it impossible to evaluate whether the cited works constitute a representative sample or over-represent positive demonstrations.

Authors: We agree that an explicit description of the literature selection process is needed for transparency. Although the paper is framed as a narrative synthesis of recent advances rather than a formal systematic review, we will add a dedicated subsection in the Introduction (or a new 'Methods' section) detailing the approach taken. This will include: search databases (PubMed, arXiv, Google Scholar), date range (primarily 2020–2023), keywords (combinations of 'diffusion model', 'transformer', 'generative AI' with each of the nine healthcare domains), inclusion criteria (peer-reviewed or preprint works reporting empirical applications or benchmarks), and exclusion criteria (purely theoretical works or non-generative methods). We will also expand the existing limitations discussion to note potential publication bias and reference studies highlighting challenges or null results where relevant. These changes will allow readers to better evaluate the synthesis.

revision: yes
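The keyword scheme described in the response (each model term paired with each healthcare domain) can be enumerated mechanically. The sketch below is only an illustration of that pairing. The split of the domains into nine entries follows the comma structure of the abstract and is an assumption, as are the quoted `AND` query syntax and the name `build_queries`; no database's actual search grammar is implied.

```python
from itertools import product

# Model terms quoted in the authors' response.
MODEL_TERMS = ["diffusion model", "transformer", "generative AI"]

# One possible reading of the "nine healthcare domains" (assumed split).
DOMAINS = [
    "medical imaging",
    "protein structure prediction",
    "clinical documentation",
    "diagnostic assistance",
    "radiology interpretation",
    "clinical decision support",
    "medical coding",
    "billing",
    "drug design and molecular representation",
]

def build_queries(model_terms, domains):
    """Pair every model term with every domain as a quoted boolean query."""
    return [f'"{m}" AND "{d}"' for m, d in product(model_terms, domains)]

queries = build_queries(MODEL_TERMS, DOMAINS)
for q in queries[:3]:
    print(q)
```

Writing the full query list down in this way would let a reader rerun the search and compare the hits against the cited works.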

Circularity Check

0 steps flagged

No circularity: survey of external literature only

This is a review paper whose central claims consist of summaries of cited external publications on diffusion models, transformers, and their healthcare applications. No equations, fitted parameters, derivations, or internal predictions appear in the abstract or described structure. No self-citation is invoked as a load-bearing uniqueness theorem or ansatz. The representativeness concern raised by the skeptic is a question of selection bias, not a reduction of any claimed result to the paper's own inputs by construction. Therefore the derivation chain is empty and the circularity score is 0.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a literature review with no new mathematical content, free parameters, axioms, or invented entities.

Reference graph

Works this paper leans on

27 extracted references · 27 canonical work pages · 1 internal anchor

[1]

Comparative Review Table 1 comprehensively categorizes the assessed diffusion model papers by their application, key findings, and the employed or inspired algorithms, including DDPMs, NCSNs, and SDEs. It highlights each algorithm's fundamental concepts and objectives and the potential practical applications that can be explored and implemented in future...

work page 2023
[2]

(Abdine et al

ProtAlbert Protein sequence profile prediction Novel methods for interpreting attention weights led to more accurate predictions of protein sequence profiles. (Abdine et al

work page
[3]

(Boadu, Cao, and Cheng 2023) TransFun Protein function prediction Combining protein sequences and 3D structures leads to accurate protein function prediction

Prot2Text Protein function description Combining Graph Neural Networks (GNNs) and LLMs provides detailed and accurate descriptions of protein functions. (Boadu, Cao, and Cheng 2023) TransFun Protein function prediction Combining protein sequences and 3D structures leads to accurate protein function prediction. (Y. Cao and Shen 2021) TALE Protein function ...

work page 2023
[4]

Ferruz, Schmidt, and Höcker 2022 ProtGPT2 Novel protein sequence generation Language models trained on protein sequences can generate novel proteins that mimic natural ones

ReLSO Protein sequence optimization Transformer-based autoencoder optimizes protein sequences for fitness landscape navigation. Ferruz, Schmidt, and Höcker 2022 ProtGPT2 Novel protein sequence generation Language models trained on protein sequences can generate novel proteins that mimic natural ones. (Ferruz, Schmidt, and Höcker 2022) ProteinBERT Protein ...

work page 2022
[5]

Clinical documentation and (Gérardin et al

ProteinBERT Protein sequence processing A specialized deep language model amalgamates local and global representations for comprehensive end-to-end processing. Clinical documentation and (Gérardin et al. 2023) Transformer deep neural network Analyzing the layout of PDF clinical documents Developed and validated an algorithm for extracting clinically rele...

work page 2023
[6]

BioGottBERT German clinical notes Newly trained BioGottBERT model outperformed the GottBERT model in clinical named entity recognition (NER) tasks. (Y. Li et al. 2022) ClinicalLongformer, Clinical-BigBird Clinical text Introduced two domain-specific language models pre-trained on a large corpus of clinical text, improving several downstream clinical NLP t...

work page 2022
[7]

BART Summarization of Biomedical Health Care (BHC) data Proposed an advanced abstractive summarization model based on BART that includes a clinical oncology-aware guidance signal for key terms, facilitating the creation of problem-list-oriented abstractive summaries. (Sivarajkumar and Wang 2022) HealthPrompt Clinical texts Proposed a groundbreaking prompt...

work page 2022
[8]

(Yogarajan et al

ClinicalLayoutLM Categorizing scanned clinical documents Introduced a multimodal technique that combined text obtained from optical character recognition (OCR) with layout or image information, outperforming the baseline model (which relied solely on OCR text) in classifying scanned clinical documents into 16 categories. (Yogarajan et al. 2021) Domain-spe...

work page 2021
[9]

YOLO Utilization of AI to enhance real-time detection and segmentation of nasopharyngeal carcinoma (NPC) during endoscopic examinations The study developed a deep learning-based model employing the YOLO (You Only Look Once) network, which demonstrated high performance in accurately identifying and segmenting NPC lesions in real-time during endoscopic proc...

work page 2022
[10]

ED-Copilot Reducing emergency department wait times through AI-assisted diagnostic recommendations ED-Copilot effectively personalized treatment recommendations based on patient severity, highlighting its potential as a diagnostic assistant to improve efficiency in emergency departments Medical imaging and radiology interpretation (Balouch and Hussain 202...

work page 2023
[11]

(Nimalsiri et al

TrMRG Report generation Achieved noteworthy results compared to prevailing methods. (Nimalsiri et al

work page
[12]

Clinical Decision Support (J

MERGIS Automated report generation Utilized image segmentation and a modern transformer-based encoder-decoder model to enhance the accuracy of automated report generation. Clinical Decision Support (J. Feng, Shaib, and Rudzicz 2020) Hierarchical CNN transformer, ClinicalBERT Sepsis prediction, ICU mortality The model captures phrase-level patterns and glo...

work page 2020
[13]

It captured the underlying hierarchical structures in medical codes and outperformed all baseline models regarding prediction accuracy for medication recommendation tasks

G-BERT model Articulating medical codes and recommending medications G-BERT combines the strengths of Graph Neural Networks (GNNs) and BERT for articulating medical codes and recommending medications. It captured the underlying hierarchical structures in medical codes and outperformed all baseline models regarding prediction accuracy for medication recomm...

work page 2023
[14]

(Fabian et al

MolGPT Generate compounds with targeted scaffolds and chemical characteristics MolGPT utilizes scaffold SMILES strings to construct molecules with property values that deviate from the provided values while maintaining the ability to produce molecules with user-specified scaffolds. (Fabian et al

work page
[15]

MOLBERT utilized learned molecular representations and outperformed prevailing state-of-the-art models on benchmark datasets

MOLBERT Predict drug-target interactions, manage molecular properties and Virtual Screening. MOLBERT utilized learned molecular representations and outperformed prevailing state-of-the-art models on benchmark datasets. The study highlighted the importance of selecting appropriate self-supervised tasks during pre-training. (K. Huang et al

work page
[16]

It outperformed leading-edge baseline models in a comparative analysis using real-world data

MolTrans More precise and interpretable drug-target interaction (DTI) predictions MolTrans combines a knowledge-inspired sub-structural pattern mining algorithm, an interaction modeling module, and an enhanced transformer encoder. It outperformed leading-edge baseline models in a comparative analysis using real-world data. (H. Li, Zhao, and Zeng 2022) KP...

work page 2022
[17]

GROVER combines Message Passing Networks with Transformer-style architecture to create more expressive encoders for complex information

GROVER Interpreting structural and semantic details about molecules, predict the existence of semantic motifs in molecules. GROVER combines Message Passing Networks with Transformer-style architecture to create more expressive encoders for complex information. It identifies semantic motifs in molecular networks and predicts their existence in a molecule u...

work page
[18]

MolEdit3D Structure-based drug design through 3D molecular generation and optimization MolEdit3D combines 3D molecular generation with optimization frameworks, employing a novel 3D graph editing model pre-trained on extensive 3D ligand data. This approach enhances the generation of molecules with favorable target-dependent and target-independent propertie...

work page
[19]

Token-Mol Tokenized drug design utilizing large language models Token-Mol encodes comprehensive molecular information, including 2D and 3D structures, into tokenized formats, transforming drug discovery tasks into probabilistic prediction problems. Through fine-tuning and reinforcement learning, Token-Mol demonstrates performance comparable to or surpassi...

work page
[20]

black box

Future Direction and Open Challenges Generative AI, including diffusion models and transformer-based models, has showcased remarkable potential in the healthcare domain, particularly in medical imaging and disease diagnostics, by overcoming hurdles encountered by earlier models. It does not necessitate labeled data, making it a potent tool for numerous m...

work page 2022
[21]

We differentiated the diffusion models into three main categories: DDPMs, NCSNs, and SDEs, and elaborated on the attention mechanisms within transformers

Conclusion In this study, we explored extensively the literature surrounding diffusion and transformer-based models, emphasizing their use in healthcare. We differentiated the diffusion models into three main categories: DDPMs, NCSNs, and SDEs, and elaborated on the attention mechanisms within transformers. Then we investigated diffusion models' roles in...

work page
[22]

Acknowledgement XXXXXXX

work page
[23]

Prot2Text: Multimodal Protein’s Function Generation with GNNs and Transformers

Statements & Declarations Ethics Approval Ethical approval was obtained from the ethics committee of XXXXX Competing Interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. Consent to Participate Informed consent was obtained from al...

work page arXiv 2023
[24]

The Role of Generative Adversarial Networks in Brain MRI: A Scoping Review

“The Role of Generative Adversarial Networks in Brain MRI: A Scoping Review.” Insights into Imaging 13 (1): 98. Asperti, Andrea. 2019. “Variational Autoencoders and the Variable Collapse Phenomenon.” Sensors & Transducers 234 (6): 1–8. Azad, Reza, Moein Heidari, Moein Shariatnia, Ehsan Khodapanah Aghdam, Sanaz Karimijafarbigloo, Ehsan Adeli, and Dorit Mer...

work page doi:10.3389/fdgth.2022.1065581 2019
[25]

HealthPrompt: A Zero-Shot Learning Paradigm for Clinical Natural Language Processing

28 Nov. 2024, doi:10.1055/a-2491-3872. Sivarajkumar, Sonish, and Yanshan Wang. 2022. “HealthPrompt: A Zero-Shot Learning Paradigm for Clinical Natural Language Processing.” AMIA ... Annual Symposium Proceedings. AMIA Symposium 2022: 972–81. Sohl-Dickstein, Jascha, Eric Weiss, Niru Maheswaranathan, and Surya Ganguli. 2015. “Deep Unsupervised Learning Using...

work page doi:10.1055/a-2491-3872 2024
[26]

Score-Based Generative Modeling through Stochastic Differential Equations

“Score-Based Generative Modeling through Stochastic Differential Equations.” arXiv Preprint arXiv:2011.13456. Strubell, Emma, Ananya Ganesh, and Andrew McCallum. 2019. “Energy and Policy Considerations for Deep Learning in NLP.” arXiv Preprint arXiv:1906.02243. Sun, Bohang, and Pietro Liò. "Multi-Head Explainer: A General Framework to Improve Explainabili...

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2304.02886 2011
[27]

BERTology Meets Biology: Interpreting Attention in Protein Language Models

“BERTology Meets Biology: Interpreting Attention in Protein Language Models.” https://doi.org/10.48550/ARXIV.2006.15222. Vincent, Pascal. 2011. “A Connection between Score Matching and Denoising Autoencoders.” Neural Computation 23 (7): 1661–74. Waibel, Dominik JE, Ernst Röell, Bastian Rieck, Raja Giryes, and Carsten Marr. 2022. “A Diffusion Model Predic...

work page doi:10.48550/arxiv.2006.15222 2006

[1] [1]

Comparative Review Table 1 comprehensively categorizes the assessed diffusion model papers by their application, key findings, and the employed or inspired algorithms, including DDPMs, NCSNs, and SDEs. It highlights each algorithm's fundamental concepts and objectives and the poten tial practical applications that can be explored and implemented in future...

work page 2023

[2] [2]

(Abdine et al

ProtAlbert Protein sequence profile prediction Novel methods for interpreting attention weights led to more accurate predictions of protein sequence profiles. (Abdine et al

work page

[3] [3]

(Boadu, Cao, and Cheng 2023) TransFun Protein function prediction Combining protein sequences and 3D structures leads to accurate protein function prediction

Prot2Text Protein function description Combining Graph Neural Networks (GNNs) and LLMs provides detailed and accurate descriptions of protein functions. (Boadu, Cao, and Cheng 2023) TransFun Protein function prediction Combining protein sequences and 3D structures leads to accurate protein function prediction. (Y. Cao and Shen 2021) TALE Protein function ...

work page 2023

[4] [4]

Ferruz, Schmidt, and Höcker 2022 ProtGPT2 Novel protein sequence generation Language models trained on protein sequences can generate novel proteins that mimic natural ones

ReLSO Protein sequence optimization Transformer-based autoencoder optimizes protein sequences for fitness landscape navigation. Ferruz, Schmidt, and Höcker 2022 ProtGPT2 Novel protein sequence generation Language models trained on protein sequences can generate novel proteins that mimic natural ones. (Ferruz, Schmidt, and Höcker 2022) ProteinBERT Protein ...

work page 2022

[5] [5]

Clinical documentation and (Gérardin et al

ProteinBERT Protein sequence processing A specialized deep language model amalgamates local and global representations for comprehensive end-to- end processing. Clinical documentation and (Gérardin et al. 2023) Transformer deep neural network Analyzing the layout of PDF clinical documents Developed and validated an algorithm for extracting clinically rele...

work page 2023

[6] [6]

BioGottBERT German clinical notes Newly trained BioGottBERT model outperformed the GottBERT model in clinical named entity recognition (NER) tasks. (Y. Li et al. 2022) ClinicalLongformer, Clinical-BigBird Clinical text Introduced two domain-specific language models pre-trained on a large corpus of clinical text, improving several downstream clinical NLP t...

work page 2022

[7] [7]

BART Summarization of Biomedical Health Care (BHC) data Proposed an advanced abstractive summarization model based on BART that includes a clinical oncology-aware guidance signal for key terms, facilitating the creation of problem-list-oriented abstractive summaries. (Sivarajkumar and Wang 2022) HealthPrompt Clinical texts Proposed a groundbreaking prompt...

work page 2022

[8] [8]

ClinicalLayoutLM Categorizing scanned clinical documents Introduced a multimodal technique that combined text obtained from optical character recognition (OCR) with layout or image information, outperforming the baseline model (which relied solely on OCR text) in classifying scanned clinical documents into 16 categories. (Yogarajan et al. 2021) Domain-spe...

work page 2021

[9] [9]

YOLO Utilization of AI to enhance real-time detection and segmentation of nasopharyngeal carcinoma (NPC) during endoscopic examinations The study developed a deep learning-based model employing the YOLO (You Only Look Once) network, which demonstrated high performance in accurately identifying and segmenting NPC lesions in real-time during endoscopic proc...

work page 2022

[10] [10]

ED-Copilot Reducing emergency department wait times through AI-assisted diagnostic recommendations ED-Copilot effectively personalized treatment recommendations based on patient severity, highlighting its potential as a diagnostic assistant to improve efficiency in emergency departments. Medical imaging and radiology interpretation (Balouch and Hussain 202...

work page 2023

[11] [11]

TrMRG Report generation Achieved noteworthy results compared to prevailing methods. (Nimalsiri et al

work page

[12] [12]

MERGIS Automated report generation Utilized image segmentation and a modern transformer-based encoder-decoder model to enhance the accuracy of automated report generation. Clinical Decision Support (J. Feng, Shaib, and Rudzicz 2020) Hierarchical CNN transformer, ClinicalBERT Sepsis prediction, ICU mortality The model captures phrase-level patterns and glo...

work page 2020

[13] [13]

It captured the underlying hierarchical structures in medical codes and outperformed all baseline models regarding prediction accuracy for medication recommendation tasks

G-BERT model Articulating medical codes and recommending medications G-BERT combines the strengths of Graph Neural Networks (GNNs) and BERT for articulating medical codes and recommending medications. It captured the underlying hierarchical structures in medical codes and outperformed all baseline models regarding prediction accuracy for medication recomm...

work page 2023

[14] [14]

MolGPT Generate compounds with targeted scaffolds and chemical characteristics MolGPT utilizes scaffold SMILES strings to construct molecules with property values that deviate from the provided values while maintaining the ability to produce molecules with user-specified scaffolds. (Fabian et al

work page

[15] [15]

MOLBERT utilized learned molecular representations and outperformed prevailing state-of-the-art models on benchmark datasets

MOLBERT Predict drug-target interactions, manage molecular properties, and perform virtual screening. MOLBERT utilized learned molecular representations and outperformed prevailing state-of-the-art models on benchmark datasets. The study highlighted the importance of selecting appropriate self-supervised tasks during pre-training. (K. Huang et al

work page

[16] [16]

It outperformed leading-edge baseline models in a comparative analysis using real-world data

MolTrans More precise and interpretable drug-target interaction (DTI) predictions MolTrans combines a knowledge-inspired sub-structural pattern mining algorithm, an interaction modeling module, and an enhanced transformer encoder. It outperformed leading-edge baseline models in a comparative analysis using real-world data. (H. Li, Zhao, and Zeng 2022) KP...

work page 2022

[17] [17]

GROVER combines Message Passing Networks with Transformer-style architecture to create more expressive encoders for complex information

GROVER Interpreting structural and semantic details about molecules and predicting the existence of semantic motifs in molecules. GROVER combines Message Passing Networks with Transformer-style architecture to create more expressive encoders for complex information. It identifies semantic motifs in molecular networks and predicts their existence in a molecule u...

work page

[18] [18]

MolEdit3D Structure-based drug design through 3D molecular generation and optimization MolEdit3D combines 3D molecular generation with optimization frameworks, employing a novel 3D graph editing model pre-trained on extensive 3D ligand data. This approach enhances the generation of molecules with favorable target-dependent and target-independent propertie...

work page

[19] [19]

Token-Mol Tokenized drug design utilizing large language models Token-Mol encodes comprehensive molecular information, including 2D and 3D structures, into tokenized formats, transforming drug discovery tasks into probabilistic prediction problems. Through fine-tuning and reinforcement learning, Token-Mol demonstrates performance comparable to or surpassi...

work page
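The Token-Mol entry above describes encoding molecules as token sequences so that drug-design tasks become next-token prediction. As a rough illustration of the first step only, the following Python sketch splits a SMILES string into tokens and builds an integer vocabulary. The regular expression and the helper names `tokenize_smiles` and `build_vocab` are assumptions made for this example; they are not Token-Mol's actual tokenizer or vocabulary, and 3D information is not handled here.

```python
import re

# Hypothetical, simplified SMILES token pattern (an assumption for this sketch).
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]"            # bracket atoms such as [C@H] or [NH4+]
    r"|Br|Cl"                # two-letter halogens, matched before single letters
    r"|%\d{2}"               # two-digit ring-closure labels
    r"|[A-Za-z]"             # single-letter (aliphatic or aromatic) atoms
    r"|\d"                   # single-digit ring-closure labels
    r"|[=#\-+()/\\.@:~*$]"   # bonds, branches, and other symbols
)


def tokenize_smiles(smiles):
    """Split a SMILES string into tokens; fail if any character is left over."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognized characters in SMILES: {smiles!r}")
    return tokens


def build_vocab(token_lists):
    """Map every distinct token to an integer id (sorted for reproducibility)."""
    distinct = sorted({tok for tokens in token_lists for tok in tokens})
    return {tok: idx for idx, tok in enumerate(distinct)}


aspirin = "CC(=O)Oc1ccccc1C(=O)O"
alanine = "C[C@H](N)C(=O)O"
token_lists = [tokenize_smiles(aspirin), tokenize_smiles(alanine)]
vocab = build_vocab(token_lists)
encoded = [[vocab[tok] for tok in tokens] for tokens in token_lists]
```

A language model would then be trained on sequences like `encoded`; the round-trip check inside `tokenize_smiles` is a design choice that makes silent loss of characters impossible.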

[20] [20]

Future Direction and Open Challenges Generative AI, including diffusion models and transformer-based models, has showcased remarkable potential in the healthcare domain, particularly in medical imaging and disease diagnostics, by overcoming hurdles encountered by earlier models. It does not necessitate labeled data, making it a potent tool for numerous m...

work page 2022

[21] [21]

We differentiated the diffusion models into three main categories: DDPMs, NCSNs, and SDEs, and elaborated on the attention mechanisms within transformers

Conclusion In this study, we extensively explored the literature surrounding diffusion and transformer-based models, emphasizing their use in healthcare. We differentiated the diffusion models into three main categories: DDPMs, NCSNs, and SDEs, and elaborated on the attention mechanisms within transformers. Then we investigated diffusion models' roles in...

work page
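The conclusion above names DDPMs as one of the three diffusion categories. As a minimal sketch of the forward (noising) half of a DDPM, the Python code below samples x_t directly from x_0 using the standard closed form x_t = sqrt(ᾱ_t) x_0 + sqrt(1 − ᾱ_t) ε. The linear noise schedule, its endpoint values, the step count, and the function names `make_alpha_bar` and `q_sample` are assumptions chosen for illustration, not settings taken from the paper.

```python
import numpy as np


def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for an assumed linear beta schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def q_sample(x0, t, alpha_bar, rng):
    """Draw x_t given x_0 in one step; also return the noise that was added."""
    noise = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise
    return xt, noise


rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar()

# Stand-in for an image patch; a real pipeline would load normalized pixel data.
x0 = rng.standard_normal((8, 8))

x_early, _ = q_sample(x0, 10, alpha_bar, rng)   # still close to x0
x_late, _ = q_sample(x0, 999, alpha_bar, rng)   # almost pure Gaussian noise
```

In training, a network would be asked to predict the returned `noise` from `x_t` and `t`; that reverse model is outside the scope of this sketch.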

[22] [22]

Acknowledgement XXXXXXX

work page

[23] [23]

Prot2Text: Multimodal Protein’s Function Generation with GNNs and Transformers

Statements & Declarations Ethics Approval Ethical approval was obtained from the ethics committee of XXXXX Competing Interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. Consent to Participate Informed consent was obtained from al...

work page arXiv 2023

[24] [24]

The Role of Generative Adversarial Networks in Brain MRI: A Scoping Review

“The Role of Generative Adversarial Networks in Brain MRI: A Scoping Review.” Insights into Imaging 13 (1): 98. Asperti, Andrea. 2019. “Variational Autoencoders and the Variable Collapse Phenomenon.” Sensors & Transducers 234 (6): 1–8. Azad, Reza, Moein Heidari, Moein Shariatnia, Ehsan Khodapanah Aghdam, Sanaz Karimijafarbigloo, Ehsan Adeli, and Dorit Mer...

work page doi:10.3389/fdgth.2022.1065581 2019

[25] [25]

HealthPrompt: A Zero-Shot Learning Paradigm for Clinical Natural Language Processing

28 Nov. 2024, doi:10.1055/a-2491-3872. Sivarajkumar, Sonish, and Yanshan Wang. 2022. “HealthPrompt: A Zero-Shot Learning Paradigm for Clinical Natural Language Processing.” AMIA ... Annual Symposium Proceedings. AMIA Symposium 2022: 972–81. Sohl-Dickstein, Jascha, Eric Weiss, Niru Maheswaranathan, and Surya Ganguli. 2015. “Deep Unsupervised Learning Using...

work page doi:10.1055/a-2491-3872 2024

[26] [26]

Score-Based Generative Modeling through Stochastic Differential Equations

“Score-Based Generative Modeling through Stochastic Differential Equations.” arXiv Preprint arXiv:2011.13456. Strubell, Emma, Ananya Ganesh, and Andrew McCallum. 2019. “Energy and Policy Considerations for Deep Learning in NLP.” arXiv Preprint arXiv:1906.02243. Sun, Bohang, and Pietro Liò. "Multi-Head Explainer: A General Framework to Improve Explainabili...

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2304.02886 2011

[27] [27]

BERTology Meets Biology: Interpreting Attention in Protein Language Models

“BERTology Meets Biology: Interpreting Attention in Protein Language Models.” https://doi.org/10.48550/ARXIV.2006.15222. Vincent, Pascal. 2011. “A Connection between Score Matching and Denoising Autoencoders.” Neural Computation 23 (7): 1661–74. Waibel, Dominik JE, Ernst Röell, Bastian Rieck, Raja Giryes, and Carsten Marr. 2022. “A Diffusion Model Predic...

work page doi:10.48550/arxiv.2006.15222 2006