Four of five tested LLMs showed better diagnostic reasoning and accuracy when prompted in English than in French on physician-scored clinical vignettes.
Better to ask in English: Cross-lingual evaluation of large language models for healthcare queries
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CL (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Prompting language influences diagnostic reasoning and accuracy of large language models