Title resolution pending
3 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2026 (3)
citing papers explorer
- MI-CXR: A Benchmark for Longitudinal Reasoning over Multi-Interval Chest X-rays
  MI-CXR is a new benchmark showing that state-of-the-art vision-language models achieve only 29.3% accuracy on longitudinal reasoning tasks across multi-visit chest X-ray sequences.
- Learning Faster with Better Tokens: Parameter-Efficient Vocabulary Adaptation for Specialized Text Summarization
  Vocabulary adaptation via targeted token addition and replacement improves semantic similarity, domain word usage, and training efficiency for LLM summarization in legal and medical domains.
- MedScribe: Clinically Grounded CT Reporting through Agentic Workflows
  MedScribe reformulates CT radiology reporting as an agentic evidence-acquisition workflow using LLM-invoked diagnostic tools and pathology-aligned retrieval, yielding higher clinical accuracy and consistency than standard VLMs on CT-RATE and RadChestCT.