RIHA proposes a hierarchical alignment transformer that uses multi-scale visual and textual feature pyramids plus optimal transport to generate more accurate radiology reports from medical images.
Promptmrg: Diagnosis-driven prompts for medical report generation
3 Pith papers cite this work. Polarity classification is still indexing.
years
2026 3verdicts
UNVERDICTED 3representative citing papers
Transferring a 2D MLLM to 3D CT inputs via parameter reuse, a Text-Guided Hierarchical MoE framework, and two-stage training yields better performance than prior 3D medical MLLMs on medical report generation and visual question answering.
CPO++ adapts reinforcement fine-tuning of MLLMs to endogenous multi-modal concept drift through counterfactual reasoning and preference optimization, yielding better coherence and cross-domain robustness in safety-critical settings.
citing papers explorer
-
RIHA: Report-Image Hierarchical Alignment for Radiology Report Generation
RIHA proposes a hierarchical alignment transformer that uses multi-scale visual and textual feature pyramids plus optimal transport to generate more accurate radiology reports from medical images.
-
Adapting 2D Multi-Modal Large Language Model for 3D CT Image Analysis
Transferring a 2D MLLM to 3D CT inputs via parameter reuse, a Text-Guided Hierarchical MoE framework, and two-stage training yields better performance than prior 3D medical MLLMs on medical report generation and visual question answering.
-
Towards Robust Endogenous Reasoning: Unifying Drift Adaptation in Non-Stationary Tuning
CPO++ adapts reinforcement fine-tuning of MLLMs to endogenous multi-modal concept drift through counterfactual reasoning and preference optimization, yielding better coherence and cross-domain robustness in safety-critical settings.