MedHEval: Benchmarking hallucinations and mitigation strategies in medical large vision–language models

Chang et al. · 2025

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Med-StepBench: A Hierarchical Reasoning Framework for Evaluating Hallucinations in Medical Vision-Language Models

cs.CV · 2026-05-11 · unverdicted · novelty 7.0

Med-StepBench is the first large-scale step-wise hallucination benchmark for 3D oncological PET/CT that decomposes clinical reasoning into four stages and reveals systematic VLM failures hidden by aggregate metrics.

citing papers explorer

Showing 1 of 1 citing paper.

Med-StepBench: A Hierarchical Reasoning Framework for Evaluating Hallucinations in Medical Vision-Language Models

cs.CV · 2026-05-11 · unverdicted · none · ref 6
