Empirical benchmark of GPT-4o, Gemini 2.5 Flash, and Qwen 2.5 7B finds superior OCR performance over EasyOCR but inconsistent gains in overall PHI detection accuracy, with strongest improvements on complex imprint patterns.
The text localization module identifies text areas within images, while the text extraction module serves as an OCR engine, converting pixel-level text into machine-encoded text
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CV (1)
Years: 2025 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Towards Selection of Large Multimodal Models as Engines for Burned-in Protected Health Information Detection in Medical Images