CheXTemporal supplies paired chest X-rays with an explicit temporal-progression taxonomy and spatial grounding to benchmark and improve models on longitudinal reasoning tasks.
hub
Maira-2: Grounded radiology report generation
19 Pith papers cite this work. Polarity classification is still indexing.
representative citing papers
SHOVIR is a benchmark extending MIMIC-CXR and PadChest-GR with per-box labels and occlusion tests to isolate direct and contextual vision shortcuts in VLMs for radiology report generation.
Transition-aware best-of-N sampling embeds report sentences as sets, computes directional transition vectors via set-to-set distances, and scores candidates by proximity to ground-truth training transitions.
Astra is a 3D CT vision-language foundation model trained on 90,678 thoracoabdominal scans that claims 44.1% better diagnostic metrics on internal and six external cohorts plus 29.6% faster chest reporting in real workflows.
CCS selects the best radiology report from multiple MLLM candidates by measuring clinical consensus with combined text and multimodal embedding utilities, yielding gains over single-path and Best-of-N baselines on clinical metrics across three datasets.
CoNNS uses an LLM-built concept ontology and cross-patient relabeling to filter noisy negatives, improving zero-shot classification and grounding of chest X-ray findings over prior methods.
Clinical VLMs over-rely on the text modality, irrelevant clinical history, and prompt wording when making chest X-ray decisions on MIMIC-CXR data.
A spectral vision transformer achieves comparable or superior performance with fewer parameters than standard ViTs, CNNs, and other models by using spectral projections for tokenization in limited-data medical imaging.
DCP-PD improves macro F1 scores on CT report generation benchmarks and introduces a hierarchical location-aware evaluation protocol that reveals ongoing challenges in pathology spatial grounding.
MARL-Rad trains region-specific and global agents with reinforcement learning on clinical rewards to produce more accurate radiology reports than prior methods on MIMIC-CXR and IU X-ray datasets.
RA-RRG extracts key phrases with LLMs, retrieves them via multimodal similarity, and conditions report generation on them to achieve SOTA CheXbert scores and competitive RadGraph F1 on MIMIC-CXR and IU X-ray while supporting multi-view inputs.
PMC-InterCPT builds a context-grounded biomedical interleaved corpus from PMC literature and shows it improves multimodal performance on Qwen3.5-4B-Base after CPT and SFT while using fewer tokens.
RadGenome-Anatomy is a large-scale chest radiograph dataset with anatomy labels obtained by projecting 3D CT masks into 2D radiographic space for 210 structures in 25,692 studies.
MedMIX combines intra-modality expert fusion, learned inter-modality fusion, and training-only large-small collaboration to deliver robust multimodal medical prediction under incomplete modalities across three benchmarks.
Medical image parsing is proposed as the central output for the field instead of masks, with an audit showing that none of eleven representative systems produces a well-formed parse containing attributes, relationships, and closure.
LoFi adds location-aware captioning loss to jointly optimize fine-grained representations, yielding better retrieval and grounding on MIMIC-CXR and PadChest-GR.
RadAgents is a multi-agent framework coupling clinical priors with task-aware multimodal reasoning and radiologist-like workflows, plus grounding and retrieval-augmentation for conflict resolution in chest X-ray interpretation.
M4CXR is a multi-modal large language model that performs multiple tasks in chest X-ray analysis including report generation with claimed SOTA clinical accuracy using chain-of-thought prompting.
A unified transformer performs four clinical tasks on chest X-rays and generates reports that radiologists rated comparable to human-written ones in 66% of cases.
citing papers explorer
- CCS: Clinical Consensus Selection for Radiology Report Generation
CCS selects the best radiology report from multiple MLLM candidates by measuring clinical consensus with combined text and multimodal embedding utilities, yielding gains over single-path and Best-of-N baselines on clinical metrics across three datasets.
- PMC-InterCPT: Rethinking Biomedical Interleaved Data for Multimodal Continued Pretraining
PMC-InterCPT builds a context-grounded biomedical interleaved corpus from PMC literature and shows it improves multimodal performance on Qwen3.5-4B-Base after CPT and SFT while using fewer tokens.
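The consensus-selection idea in the CCS summary above (pick the candidate report that agrees most with the other candidates) can be illustrated with a minimal sketch. This is not the authors' method: the summary describes combined text and multimodal embedding utilities, whereas the sketch below substitutes a plain bag-of-words cosine similarity as an assumed stand-in. The function names `bow_vector`, `cosine`, and `select_by_consensus`, and the example reports, are hypothetical.

```python
import math
from collections import Counter


def bow_vector(text):
    """Lowercased bag-of-words counts (assumed stand-in for a learned embedding)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_by_consensus(candidates):
    """Return the index of the candidate with the highest mean similarity to the others."""
    if not candidates:
        raise ValueError("need at least one candidate report")
    vectors = [bow_vector(c) for c in candidates]

    def mean_agreement(i):
        others = [cosine(vectors[i], vectors[j])
                  for j in range(len(vectors)) if j != i]
        return sum(others) / len(others) if others else 0.0

    return max(range(len(candidates)), key=mean_agreement)


# Hypothetical candidate reports sampled from a single model.
candidates = [
    "small left pleural effusion",
    "small left pleural effusion no consolidation",
    "no consolidation",
    "large right pneumothorax",
]
best = select_by_consensus(candidates)
print(candidates[best])
```

In this toy setup the outlier report shares no tokens with the others and so receives zero agreement, while the report that overlaps with the most candidates is selected. A real system would presumably replace `bow_vector` with clinically informed text or image-text embeddings.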