Mmed-rag: Versatile multimodal rag system for medical vision language models

Peng Xia, Kangyu Zhu, Haoran Li, Tianze Wang, Weijia Shi, Sheng Wang, Linjun Zhang, James Zou, Huaxiu Yao · 2024 · arXiv 2410.13085

4 Pith papers cite this work. Polarity classification is still indexing.

4 Pith papers citing it

read on arXiv browse 4 citing papers

citation-role summary

background 1 baseline 1

citation-polarity summary

background 1 baseline 1

representative citing papers

Neuro-Oracle: A Trajectory-Aware Agentic RAG Framework for Interpretable Epilepsy Surgical Prognosis

cs.MM · 2026-04-10 · unverdicted · novelty 6.0

Neuro-Oracle distills longitudinal MRI changes into trajectory vectors via a 3D Siamese encoder, retrieves similar cases, and generates LLM-based prognoses, achieving AUC 0.834-0.905 on a resection-type proxy task versus 0.793 for single-timepoint baseline.

SemEnrich: Self-Supervised Semantic Enrichment of Radiology Reports for Vision-Language Learning

cs.LG · 2026-04-10 · unverdicted · novelty 6.0

SemEnrich enriches radiology reports with positive/neutral findings via self-supervised semantic clustering, yielding average gains of 5-7% on COMET, BERT score, Sentence BLEU, CheXbert-F1 and RadGraph-F1 after fine-tuning, plus further gains when cluster info is added to GRPO rewards.

MicroWorld: Empowering Multimodal Large Language Models to Bridge the Microscopic Domain Gap with Multimodal Attribute Graph

cs.CV · 2026-05-11 · unverdicted · novelty 5.0

MicroWorld constructs a multimodal attributed property graph from scientific image-caption data and augments MLLM prompts via retrieval to raise Qwen3-VL-8B performance by 37.5% on MicroVQA and 6% on MicroBench.

Retina-RAG: Retrieval-Augmented Vision-Language Modeling for Joint Retinal Diagnosis and Clinical Report Generation

cs.CV · 2026-05-07 · unverdicted · novelty 4.0 · 2 refs

Retina-RAG combines a retinal classifier, LoRA-tuned Qwen2.5-VL, and RAG to jointly grade DR, detect ME, and generate reports, reaching F1 scores of 0.731 and 0.948 while exceeding baselines on ROUGE-L and SBERT metrics.

citing papers explorer

Showing 4 of 4 citing papers.

Neuro-Oracle: A Trajectory-Aware Agentic RAG Framework for Interpretable Epilepsy Surgical Prognosis cs.MM · 2026-04-10 · unverdicted · none · ref 33
Neuro-Oracle distills longitudinal MRI changes into trajectory vectors via a 3D Siamese encoder, retrieves similar cases, and generates LLM-based prognoses, achieving AUC 0.834-0.905 on a resection-type proxy task versus 0.793 for single-timepoint baseline.
SemEnrich: Self-Supervised Semantic Enrichment of Radiology Reports for Vision-Language Learning cs.LG · 2026-04-10 · unverdicted · none · ref 25
SemEnrich enriches radiology reports with positive/neutral findings via self-supervised semantic clustering, yielding average gains of 5-7% on COMET, BERT score, Sentence BLEU, CheXbert-F1 and RadGraph-F1 after fine-tuning, plus further gains when cluster info is added to GRPO rewards.
MicroWorld: Empowering Multimodal Large Language Models to Bridge the Microscopic Domain Gap with Multimodal Attribute Graph cs.CV · 2026-05-11 · unverdicted · none · ref 28
MicroWorld constructs a multimodal attributed property graph from scientific image-caption data and augments MLLM prompts via retrieval to raise Qwen3-VL-8B performance by 37.5% on MicroVQA and 6% on MicroBench.
Retina-RAG: Retrieval-Augmented Vision-Language Modeling for Joint Retinal Diagnosis and Clinical Report Generation cs.CV · 2026-05-07 · unverdicted · none · ref 15 · 2 links
Retina-RAG combines a retinal classifier, LoRA-tuned Qwen2.5-VL, and RAG to jointly grade DR, detect ME, and generate reports, reaching F1 scores of 0.731 and 0.948 while exceeding baselines on ROUGE-L and SBERT metrics.

Mmed-rag: Versatile multimodal rag system for medical vision language models

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer