However, these approaches are often computationally expensive and difficult to deploy efficiently at scale.

incorporate structured reasoning or external visual tools to improve perception under ambiguous visual conditions · 2023

1 Pith paper cites this work. Polarity classification is still indexing.


representative citing papers

Multilingual OCR-Aware Fine-Tuning and Prompt-Guided Chain-of-Thought Reasoning for Multimodal Large Language Models

cs.CV · 2026-05-13 · unverdicted · novelty 4.0

An OCR-aware multilingual framework combining synthetic data generation, LoRA SFT, and visual CoT prompting improves text extraction and translation robustness in multimodal LLMs on degraded images.

citing papers explorer

Showing 1 of 1 citing paper.

Multilingual OCR-Aware Fine-Tuning and Prompt-Guided Chain-of-Thought Reasoning for Multimodal Large Language Models · cs.CV · 2026-05-13 · unverdicted · none · ref 2
