Using GPT-5.4 to clean labels in the CT-RATE chest CT dataset revealed 3.6% discordance with original labels, with radiologists supporting the LLM labels in 74-92% of reviewed cases.
Radiology 307:e230725
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Representative citing papers
Automatic Extraction of Structured Information from Brain MRI Reports Using an Open-Weight Large Language Model
LLaMA 3.1 extracts visual rating scores from Dutch neuroradiology reports with 87-96% balanced accuracy but only 66-80% on numerical counts, with few-shot prompting raising the latter to 81-92%.