Empirical benchmark of GPT-4o, Gemini 2.5 Flash, and Qwen 2.5 7B finds superior OCR performance over EasyOCR but inconsistent gains in overall PHI detection accuracy, with strongest improvements on complex imprint patterns.
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CV 1years
2025 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Towards Selection of Large Multimodal Models as Engines for Burned-in Protected Health Information Detection in Medical Images
Empirical benchmark of GPT-4o, Gemini 2.5 Flash, and Qwen 2.5 7B finds superior OCR performance over EasyOCR but inconsistent gains in overall PHI detection accuracy, with strongest improvements on complex imprint patterns.