Offline handwritten Chinese text recognition with convolutional neural networks
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
dataset 1
citation-polarity summary
fields
cs.CV 2
verdicts
UNVERDICTED 2
roles
dataset 1
polarities
use dataset 1
representative citing papers
Sequence-level modeling, not shared visual features, explains cross-language transfer improvements in low-resource Arabic-script HTR.
citing papers explorer
- Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling
InternVL 2.5 is the first open-source MLLM to surpass 70% on the MMMU benchmark via model, data, and test-time scaling, with a 3.7-point gain from chain-of-thought reasoning.
- Understanding Cross-Language Transfer Improvements in Low-Resource HTR: The Role of Sequence Modeling
Sequence-level modeling, not shared visual features, explains cross-language transfer improvements in low-resource Arabic-script HTR.