How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites, 2024

Zhe Chen, Weiyun Wang, Hao Tian, Shenglong Ye, Zhangwei Gao, Erfei Cui, Wenwen Tong, Kongzhi Hu, Jiapeng Luo, Zheng Ma, Ji Ma, Jiaqi Wang, Xiaoyi Dong, Hang Yan, Hewei Guo, Conghui He, Botian Shi, Zhenjiang Jin, Chao Xu, Bin Wang, Xingjian · 2024

2 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

baseline 1

citation-polarity summary

baseline 1

representative citing papers

Focusing Where Vision Matters: Selective Training for Large Vision Language Models via Visual Information Gain

cs.CV · 2026-02-19 · unverdicted · novelty 7.0

Introduces the Visual Information Gain (VIG) metric, which measures visual contribution via perplexity reduction, and applies it to selectively train LVLMs on high-VIG samples and tokens, improving grounding with reduced supervision.

OCRBench: On the Hidden Mystery of OCR in Large Multimodal Models

cs.CV · 2023-05-13 · accept · novelty 6.0

OCRBench provides the largest evaluation suite yet for OCR capabilities in large multimodal models, revealing gaps in multilingual, handwritten, and mathematical text handling.

citing papers explorer


Focusing Where Vision Matters: Selective Training for Large Vision Language Models via Visual Information Gain

cs.CV · 2026-02-19 · unverdicted · none · ref 5

Introduces the Visual Information Gain (VIG) metric, which measures visual contribution via perplexity reduction, and applies it to selectively train LVLMs on high-VIG samples and tokens, improving grounding with reduced supervision.

OCRBench: On the Hidden Mystery of OCR in Large Multimodal Models

cs.CV · 2023-05-13 · accept · none · ref 64

OCRBench provides the largest evaluation suite yet for OCR capabilities in large multimodal models, revealing gaps in multilingual, handwritten, and mathematical text handling.
