SiMing-Bench shows current MLLMs have weak agreement with physicians on procedural correctness in clinical videos, with intermediate step judgments remaining poor even when overall scores look acceptable.
arXiv preprint arXiv:2602.23363 (2026)
3 Pith papers cite this work. Polarity classification is still indexing.
3
Pith papers citing it
citation-role summary
baseline 1
citation-polarity summary
fields
cs.CV 3years
2026 3roles
baseline 1polarities
baseline 1representative citing papers
CogAlign uses hierarchical supervised fine-tuning on clinical cognition data plus counterfactual RL to align MLLMs with expert diagnostic pathways and enforce causal lesion grounding for GI endoscopy diagnosis.
citing papers explorer
-
SiMing-Bench: Evaluating Procedural Correctness from Continuous Interactions in Clinical Skill Videos
SiMing-Bench shows current MLLMs have weak agreement with physicians on procedural correctness in clinical videos, with intermediate step judgments remaining poor even when overall scores look acceptable.
-
Clinical Cognition Alignment for Gastrointestinal Diagnosis with Multimodal LLMs
CogAlign uses hierarchical supervised fine-tuning on clinical cognition data plus counterfactual RL to align MLLMs with expert diagnostic pathways and enforce causal lesion grounding for GI endoscopy diagnosis.
- MedSynapse-V: Bridging Visual Perception and Clinical Intuition via Latent Memory Evolution