
Maira-2: Grounded radiology report generation

Shruthi Bannur, Kenza Bouzid, Daniel C Castro, Anton Schwaighofer, Sam Bond-Taylor, Maximilian Ilse, Fernando Perez-Garcia, Valentina Salvatelli, Harshita Sharma, Felix Meissen, et al · 2024 · arXiv 2406.04449

19 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background: 3 · other: 1

citation-polarity summary

background: 3 · unclear: 1

representative citing papers

CheXTemporal: A Dataset for Temporally-Grounded Reasoning in Chest Radiography

cs.CV · 2026-05-11 · accept · novelty 8.0

CheXTemporal supplies paired chest X-rays with explicit temporal progression taxonomy and spatial grounding to benchmark and improve models on longitudinal reasoning tasks.

SHOVIR: A Benchmark for Evaluating Vision Shortcut Learning in Radiology Report Generation

cs.CV · 2026-06-29 · unverdicted · novelty 7.0

SHOVIR is a benchmark extending MIMIC-CXR and PadChest-GR with per-box labels and occlusion tests to isolate direct and contextual vision shortcuts in VLMs for radiology report generation.

Transition-Aware best-of-N sampling for Longitudinal Chest X-ray Reports

cs.CV · 2026-06-23 · unverdicted · novelty 7.0

Transition-aware best-of-N sampling embeds report sentences as sets, computes directional transition vectors via set-to-set distances, and scores candidates by proximity to ground-truth training transitions.

Astra: a generalizable report generation foundation model for 3D computed tomography

cs.CV · 2026-05-29 · unverdicted · novelty 6.0

Astra is a 3D CT vision-language foundation model trained on 90,678 thoracoabdominal scans that claims 44.1% better diagnostic metrics on internal and six external cohorts plus 29.6% faster chest reporting in real workflows.

CCS: Clinical Consensus Selection for Radiology Report Generation

cs.CL · 2026-05-28 · unverdicted · novelty 6.0

CCS selects the best radiology report from multiple MLLM candidates by measuring clinical consensus with combined text and multimodal embedding utilities, yielding gains over single-path and Best-of-N baselines on clinical metrics across three datasets.

Concept-Guided Noisy Negative Suppression for Zero-Shot Classification and Grounding of Chest X-Ray Findings

cs.CV · 2026-05-19 · unverdicted · novelty 6.0

CoNNS uses an LLM-built concept ontology and cross-patient relabeling to filter noisy negatives, improving zero-shot classification and grounding of chest X-ray findings over prior methods.

Medical Context Distorts Decisions in Clinical Vision Language Models

cs.CV · 2026-05-17 · unverdicted · novelty 6.0

Clinical VLMs over-rely on text modality, irrelevant clinical history, and prompt wording when making chest x-ray decisions on MIMIC-CXR data.

Spectral Vision Transformer for Efficient Tokenization with Limited Data

cs.CV · 2026-05-12 · unverdicted · novelty 6.0

A spectral vision transformer achieves comparable or superior performance with fewer parameters than standard ViTs, CNNs, and other models by using spectral projections for tokenization in limited-data medical imaging.

Enhancing Fine-Grained Spatial Grounding in 3D CT Report Generation via Discriminative Guidance

cs.CV · 2026-04-12 · unverdicted · novelty 6.0

DCP-PD improves macro F1 scores on CT report generation benchmarks and introduces a hierarchical location-aware evaluation protocol that reveals ongoing challenges in pathology spatial grounding.

Multi-Modal Multi-Agent Reinforcement Learning for Radiology Report Generation

cs.CV · 2026-02-17 · unverdicted · novelty 6.0

MARL-Rad trains region-specific and global agents with reinforcement learning on clinical rewards to produce more accurate radiology reports than prior methods on MIMIC-CXR and IU X-ray datasets.

RA-RRG: Multimodal Retrieval-Augmented Radiology Report Generation with Key Phrase Extraction

cs.CV · 2025-04-10 · unverdicted · novelty 6.0

RA-RRG extracts key phrases with LLMs, retrieves them via multimodal similarity, and conditions report generation on them to achieve SOTA CheXbert scores and competitive RadGraph F1 on MIMIC-CXR and IU X-ray while supporting multi-view inputs.

PMC-InterCPT: Rethinking Biomedical Interleaved Data for Multimodal Continued Pretraining

cs.CL · 2026-05-31 · unverdicted · novelty 5.0

PMC-InterCPT builds a context-grounded biomedical interleaved corpus from PMC literature and shows it improves multimodal performance on Qwen3.5-4B-Base after CPT and SFT while using fewer tokens.

RadGenome-Anatomy: A Large-Scale Anatomy-Labeled Chest Radiograph Dataset via Physically Grounded Volumetric Projection

cs.CV · 2026-05-17 · unverdicted · novelty 5.0

RadGenome-Anatomy is a large-scale chest radiograph dataset with anatomy labels obtained by projecting 3D CT masks into 2D radiographic space for 210 structures in 25,692 studies.

MedMIX: Modality-Internal Expert Fusion for Multimodal Medical Diagnosis

cs.LG · 2026-05-15 · unverdicted · novelty 5.0

MedMIX combines intra-modality expert fusion, learned inter-modality fusion, and training-only large-small collaboration to deliver robust multimodal medical prediction under incomplete modalities across three benchmarks.

Beyond Masks: The Case for Medical Image Parsing

cs.CV · 2026-05-12 · unverdicted · novelty 5.0

Medical image parsing is proposed as the central output for the field instead of masks, with an audit showing that none of eleven representative systems produces a well-formed parse containing attributes, relationships, and closure.

LoFi: Location-Aware Fine-Grained Representation Learning for Chest X-ray

cs.CV · 2026-03-19 · unverdicted · novelty 5.0

LoFi adds location-aware captioning loss to jointly optimize fine-grained representations, yielding better retrieval and grounding on MIMIC-CXR and PadChest-GR.

RadAgents: Multimodal Agentic Reasoning for Chest X-ray Interpretation with Radiologist-like Workflows

cs.MA · 2025-09-24 · unverdicted · novelty 5.0

RadAgents is a multi-agent framework coupling clinical priors with task-aware multimodal reasoning and radiologist-like workflows, plus grounding and retrieval-augmentation for conflict resolution in chest X-ray interpretation.

M4CXR: Exploring Multi-task Potentials of Multi-modal Large Language Models for Chest X-ray Interpretation

cs.CV · 2024-08-29 · unverdicted · novelty 5.0

M4CXR is a multi-modal large language model that performs multiple tasks in chest X-ray analysis including report generation with claimed SOTA clinical accuracy using chain-of-thought prompting.

A unified multi-task framework enables interpretable chest radiograph analysis

cs.CV · 2026-06-02 · unverdicted · novelty 4.0

A unified transformer performs four clinical tasks on chest X-rays and generates reports rated comparable to human ones in 66% of cases by radiologists.

citing papers explorer

Showing 2 of 2 citing papers after filters.

CCS: Clinical Consensus Selection for Radiology Report Generation

cs.CL · 2026-05-28 · unverdicted · none · ref 1

PMC-InterCPT: Rethinking Biomedical Interleaved Data for Multimodal Continued Pretraining

cs.CL · 2026-05-31 · unverdicted · none · ref 3
