A Dataset of Robot-Patient and Doctor-Patient Medical Dialogues for Spoken Language Processing Tasks

2026 · cs.AI · arXiv 2605.26747

1 Pith paper cites this work. Polarity classification is still indexing.


abstract

Large Language Models (LLMs) have substantially improved Artificial Intelligence (AI) and can be applied to general-purpose tasks. However, their application to textual or spoken medical consultations is still an open research problem. This paper proposes MeDial-Speech, a novel speech dataset for training and evaluating Med-AIs that can carry out consultations with patients. It was collected in realistic environments from robot-patient and doctor-patient dialogues, contains 111+ hours of speech data (without data augmentation), and covers four health conditions: Lewy body dementia, heart failure, shoulder pain, and angina. In addition, we propose a dialogue benchmark via sentence selection (with 20 options) to evaluate three state-of-the-art LLMs: GPT-5 mini, DeepSeek-V3, and Claude Sonnet 4. Experimental results reveal that Claude Sonnet 4 performs best at sentence selection, with 71.1% accuracy using manual transcriptions and 74.7% using automatic transcriptions, and that all LLMs are highly overconfident in their probabilistic predictions, regardless of whether they select correct or incorrect sentences in medical dialogues. The dataset is available free of charge for non-commercial purposes at: https://huggingface.co/datasets/hcuayahu/MeDial-Speech
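
To make the two reported quantities concrete, the sketch below computes selection accuracy and a simple overconfidence measure on toy data. It is illustrative only: the abstract does not say which calibration metric the authors use, so expected calibration error (ECE) is an assumption here, and the function names, bin count, and toy values are all hypothetical. With 20 options, uniform guessing would score 1/20 = 5% accuracy.

```python
from typing import Sequence


def selection_accuracy(pred: Sequence[int], gold: Sequence[int]) -> float:
    """Fraction of dialogue turns where the chosen option index matches the gold index."""
    return sum(p == g for p, g in zip(pred, gold)) / len(gold)


def expected_calibration_error(
    conf: Sequence[float], correct: Sequence[bool], n_bins: int = 10
) -> float:
    """Bin predictions by confidence and average |accuracy - confidence| per bin,
    weighted by bin size. ECE is an assumed metric, not necessarily the paper's."""
    n = len(conf)
    ece = 0.0
    for b in range(n_bins):
        lo, hi = b / n_bins, (b + 1) / n_bins
        # Bins are (lo, hi]; a confidence of exactly 0.0 goes into the first bin.
        idx = [i for i, c in enumerate(conf) if lo < c <= hi or (b == 0 and c == 0.0)]
        if not idx:
            continue
        bin_acc = sum(correct[i] for i in idx) / len(idx)
        bin_conf = sum(conf[i] for i in idx) / len(idx)
        ece += len(idx) / n * abs(bin_acc - bin_conf)
    return ece


# Toy data (hypothetical): option indices in 0..19 and the model's stated confidence.
pred = [3, 7, 0, 12]
gold = [3, 7, 5, 1]
conf = [0.95, 0.91, 0.92, 0.97]
correct = [p == g for p, g in zip(pred, gold)]

print("accuracy:", selection_accuracy(pred, gold))
print("mean confidence:", sum(conf) / len(conf))
print("ECE:", expected_calibration_error(conf, correct))
```

A model is overconfident in this sense when its mean confidence sits well above its accuracy, as in the toy values, where every confidence is above 0.9 but only half of the selections are correct.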

representative citing papers

A Dataset of Robot-Patient and Doctor-Patient Medical Dialogues for Spoken Language Processing Tasks

cs.AI · 2026-05-26 · unverdicted · novelty 6.0

MeDial-Speech provides 111+ hours of spoken medical dialogues from robot-patient and doctor-patient interactions across four conditions, with a 20-option sentence selection benchmark where Claude Sonnet 4 reaches 71-75% accuracy.

